Export pages as a simple MediaWiki markup file #107

Open
marespiaut opened this Issue Jun 6, 2016 · 11 comments

marespiaut commented Jun 6, 2016

Hello,
I often use pandoc to convert generated ePubs into different file formats that are not supported by WSexport, but it's a very hacky process.
It would be interesting to be able to download several pages as a single text file containing the original MediaWiki markup text. This would simplify such a task.

Member

Tpt commented Jun 7, 2016

Sure, it's definitely something possible.

But before adding this feature: which file formats do you want to target? If they are supported by Calibre's ebook-convert (AZW3/DOCX/EPUB/FB2/HTML/LIT/LRF/MOBI/OEB/PDB/PDF/PML/RB/RTF/SNB/TCR/TXT), we could add support for them to Wsexport very easily.

marespiaut commented Jun 7, 2016

Hello,

I use pandoc to convert ePubs into LaTeX .tex files. I don't know if Calibre supports this.
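
For readers who want to reproduce this step, here is a minimal sketch of the EPUB to LaTeX conversion driven from Python. It assumes pandoc is installed and on the PATH. The helper names `pandoc_command` and `epub_to_latex` and the file names are hypothetical, chosen only for illustration, and this is not necessarily the exact invocation used here.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(epub_path, tex_path):
    """Build the pandoc argument list for an EPUB -> LaTeX conversion."""
    return [
        "pandoc",
        "--from=epub",
        "--to=latex",
        "--standalone",  # emit a complete document, preamble included
        "-o", str(tex_path),
        str(epub_path),
    ]


def epub_to_latex(epub_path, tex_path=None):
    """Run pandoc on an EPUB file; return the path of the .tex output."""
    epub_path = Path(epub_path)
    # Default (an assumption of this sketch): same name, .tex extension.
    tex_path = Path(tex_path) if tex_path else epub_path.with_suffix(".tex")
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(pandoc_command(epub_path, tex_path), check=True)
    return tex_path
```

Calling `epub_to_latex("book.epub")` would write `book.tex` next to the input. The resulting LaTeX usually still needs cleanup by hand.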

Contributor

samwilson commented Jun 7, 2016

@marespiaut Out of interest, what cool stuff are you doing converting Wikisource to LaTeX? Sounds interesting! :)

Member

Tpt commented Jun 7, 2016

Sadly, it seems that Calibre does not support it. As Wsexport was written to work on the HTML output of MediaWiki rather than on the Wikitext content, modifying it to produce wikitext/LaTeX output seems quite difficult. So I won't commit to doing it anytime soon. Sorry :-(

marespiaut commented Jun 7, 2016

@Tpt I understand, but I'm not asking for LaTeX output, but for unmodified MediaWiki markup output.

marespiaut commented Jun 7, 2016

@samwilson I print pages and make books from scratch for my personal use. Most (if not all) of the books I contribute to are out of print and impossible to buy second hand. LaTeX allows me to choose the page size, the font, the margins, etc. I then print the pages and bind them by hand.

Contributor

samwilson commented Jun 7, 2016

@marespiaut I thought perhaps it was something like that :-) Sounds great. Do you automate the process, so you can easily apply later text changes to your LaTeX files?

Member

Tpt commented Jun 7, 2016

> @Tpt I understand, but I'm not asking for LaTeX output, but for unmodified MediaWiki markup output.

Yes, I was saying that Wsexport works directly with the HTML rendering of Wikisource pages and not with Wikitext (MediaWiki markup), so modifying it to output Wikitext would be quite painful. But if you are interested in plain HTML, that's definitely possible (and very easy to implement).

marespiaut commented Jun 7, 2016

@Tpt My bad, I misunderstood! Of course it would be very interesting to see how pandoc performs on it.

marespiaut commented Jun 7, 2016

@samwilson The process isn't automatic so far; I have to do it by hand. But pandoc is customizable in Lua. I haven't looked into that yet.

Member

Tpt commented Jun 7, 2016

@marespiaut I've just added support for HTMLZ. It's a ZIP archive that contains a single HTML file with the full content of the book (plus a CSS file and the images).
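
Since an HTMLZ file is an ordinary ZIP archive, the HTML can be pulled out with standard tooling before handing it to pandoc. The sketch below is only an illustration. The helper name `read_htmlz_html` is hypothetical, the HTML file's name inside the archive is not assumed (the first `.html`/`.htm` member is used), and UTF-8 encoding is an assumption.

```python
import zipfile


def read_htmlz_html(htmlz_path, encoding="utf-8"):
    """Return (member_name, html_text) for the HTML file in an HTMLZ archive."""
    with zipfile.ZipFile(htmlz_path) as archive:
        # The member name is not guaranteed here, so search by extension.
        html_members = [
            name for name in archive.namelist()
            if name.lower().endswith((".html", ".htm"))
        ]
        if not html_members:
            raise ValueError(f"no HTML file found in {htmlz_path}")
        name = html_members[0]
        # Assumption: the HTML is UTF-8 encoded unless the caller says otherwise.
        return name, archive.read(name).decode(encoding)
```

The returned text can be written to a `.html` file and converted with pandoc like any other HTML document. The CSS file and the images stay in the archive unless they are extracted separately.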
