first version of xml, xhtml and epub export tagging support #34

Closed
wants to merge 2 commits into from

Contributor

@hernot hernot commented Apr 14, 2020

Belated happy Easter.

I managed to add tags and css files to support xml/xhtml export and epub generation.

This is a very first version, and line numbering still needs some in-depth attention. It is currently only tested with the number location set to left, which works outside floats but already breaks inside them (*). I expect even worse breakage for any of the remaining placements (*).

(*) Line numbers are not placed immediately in front of or after tags, but immediately after the float tag and before the end float tag.

But for getting some advice it should be fair and OK enough.

SYNBOL and SYNEOL now insert a <div class="syntaxlinegroup"> tag
in the xml exporter output. The trickiest part was to ensure,
independent of whether line numbering is on or off, that each line
is rendered as a distinct line by the browser and epub viewer.
Finally solved by making SYNEOL insert a <div class=break> tag,
which the exporter uses anyway to make the browser render
distinct lines.
A separate css file is generated for each colorscheme defined,
ensuring that the coloring of code matches as closely as possible
between xml/xhtml/epub and pdf output.

TODO: line numbering breaks when placed inside a float, see example
tests/vim/40-SYNBOL-SYNEOL-ln-tagging-export-float.tex. Likely
some tag or some other additional settings are needed so that
line number injection also works properly in xml/xhtml output
and line numbers are added before or after each break tag. Currently
only tested with line numbers placed left. Other placements may
cause even more errors.
@adityam
Owner

adityam commented Apr 15, 2020

This is very exciting! Thanks. I should have time next week, when I'll go over both your pull requests.

@adityam
Owner

adityam commented Apr 19, 2020

This commit only contains the test files. Did you forget to include some files?
Regarding linenumbering: If the output is correct without linenumbering, then in the worst case linenumbering can be done at the CSS/HTML level as well. Not the ideal solution, but ....
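
For illustration, numbering at the CSS level could be sketched with CSS counters. This is only a sketch under assumptions: `vimtyping` and `verbatimline` (and their `div.` class counterparts for html) are assumed names for the listing container and the per-line element, and the counter name `codeline` is arbitrary.

```css
/* Sketch only: element and class names are assumptions, not exporter output. */

/* Restart the counter at the top of every listing. */
vimtyping,
div.vimtyping {
    counter-reset: codeline;
}

/* One increment per code line; each line is rendered as its own block. */
verbatimline,
div.verbatimline {
    display: block;
    counter-increment: codeline;
}

/* Show the current number in a fixed-width gutter before the line. */
verbatimline::before,
div.verbatimline::before {
    content: counter(codeline);
    display: inline-block;
    width: 3em;
    margin-right: 1em;
    text-align: right;
    color: gray;
}
```

With this approach the numbers exist only in the rendered view, so options such as a custom start value or step would need extra rules.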

Owner

@adityam adityam left a comment

Some comments based on the first reading

subtitle={},
author={J.R.C.H.}
]
\usemodule[amsl]
Owner

Why do you add amsl? That module is not needed for this test and is deprecated anyway.

Contributor Author

Copy-paste from another source. Initially I thought to also demonstrate it, but that would be something for later, more advanced test files, if necessary at all. It can go.

Owner

Could you please remove this from all test files?

]
\usemodule[amsl]
\usemodule[vim]
\definevimtyping[Python][syntax=python,tab=4,numbering=no]
Owner

The standard style that I have followed in all the tests is to use upper case for language names. Could you change this to
\definevimtyping[PYTHON][....]

\usemodule[vim]
\definevimtyping[Python][syntax=python,tab=4,numbering=no]
\starttext
This is a little test whether modifcation in \type{\SYNBOL} and \type{\SYNEOL} have any sideeffects
Owner

sideeffects -> side effects

Comment on lines 1 to 4
\setupbackend[
export=yes,
xhtml=yes
]
Owner

Could you also follow the formatting style of the rest of the tests? Basically:

\setupbackend
    [
      export=yes,
      xhtml=yes,
    ]

subtitle={},
author={J.R.C.H.}
]
\usemodule[amsl]
Owner

Again, not needed.

a=12
def hello_world(hello)
print("simon says: {}".format(hello))
hallo_welt("how do youdo")
Owner

Minor point, but shouldn't this be hello_world? Also, youdo -> you do

@adityam
Owner

adityam commented Apr 23, 2020

This commit only contains test files. There is no file which actually implements the tagging. Perhaps you forgot to add it to your branch. Could you please check?

Thanks

Adds the files missed in the last commit and updates the test files.
Line numbering is still a wild hack and does not work within floats
and framed environments. Still work in progress.
@hernot
Contributor Author

hernot commented Apr 23, 2020

Oops, I forgot to add them. Should be solved for now; I will squash in the final commit.

@adityam
Owner

adityam commented Apr 29, 2020

Which version of ConTeXt (context --version) are you using?

With MkIV 2020.01.30, the files do not compile. For example:

context tests/vim/38-SYNBOL-SYNEOL-ln-tagging-export.tex

gives

tex error       > tex error on line 8 in file tests/vim/38-SYNBOL-SYNEOL-ln-tagging-export.tex: ! Undefined control sequence

<recently read> \dosynchronizeexport 
                     
\vimtyping@define ...ingparameter \c!alternative }
                                                  \setvalue {\e!start raw#1}...
l.8 ...hon,tab=4,numbering=yes,numberlocation=right]

I get a similar error with LMTX 2020.04.27.

@hernot
Contributor Author

hernot commented Apr 30, 2020 via email

@adityam
Owner

adityam commented Apr 30, 2020 via email

@hernot
Contributor Author

hernot commented May 16, 2020

OK, I just had some time to test at home.
The source of the error is in the \setup_export command defined by t-syntax-groups.
At about line 326 the \doifmode{*export} clause defines the method inject_css_file, which at the end (following the comment) calls \dosynchronizeexport.
That command is (still) known in the 2018 release of ConTeXt but has been removed in the current release (2020).

By replacing
\dosynchronizeexport
by
\doifdefined{dosynchronizeexport}{\dosynchronizeexport}
The error should be solved.

But the exporter seems to have some problems with newline characters, as it complains about a possible paragraph mixup and, strangely, does not output any tag to html/xhtml/xml. Not sure if this is related.

@adityam
Owner

adityam commented May 18, 2020

If I change that line, I do not get any elements in the exported XML. Did you make any changes in t-filter or t-vim?

@hernot
Contributor Author

hernot commented May 18, 2020

That's exactly the point where I'm currently struggling with the 2020 ConTeXt version (in the 2018 version it at least still seemed to work, by chance). The accompanying pdf looks as it should. The only thing I can see with my naive trial-and-error approach, after turning on the export trackers, is that the exporter has a problem with the code block being sorted before the first paragraph, which it seems to dislike.
That is why I have sent an email concerning the exporter to the mailing list. But I'm not sure whether the normal mailing list is the best place for development questions, especially when it comes to understanding how low-level things like the exporter work, or what the document tree it has to process (and chokes on) looks like.

Do you have any advice on which channel to use for asking these kinds of questions?

@adityam
Owner

adityam commented May 18, 2020 via email

@hernot
Contributor Author

hernot commented May 19, 2020

Exactly, I was too clueless and disoriented to break it down into anything helpful. I fear that for now I'm just the apprentice who has to watch and learn, and not much help otherwise, because even with the answers you got I'm still as clueless and lost as before.

@hernot
Contributor Author

hernot commented May 25, 2020

By the way, where do I find documentation on how to properly debug my contributions? How do I get to know which of my assumptions about ConTeXt's internal workings are wrong, and where do I find documentation about the design principles of ConTeXt or any of its modules?

Is the mailing list the only source?

@adityam
Owner

adityam commented May 25, 2020 via email

@adityam
Owner

adityam commented May 30, 2020

Okay, first step towards the missing features. I pushed a new version 4772457, which now fixes the spaces in export. Now looking at line numbers.

@adityam
Owner

adityam commented May 31, 2020

Linenumbering is also sorted. Was a backend issue. I am waiting for the context distribution to be updated and then we can test the export functionality.

@adityam
Owner

adityam commented May 31, 2020

See branch export-2. It works here for simple examples (with some local changes to strc-tag.lua and back-exp.lua). For example, the file:

\setupbackend[export=xml]

\usemodule[vim]

\definevimtyping[PYTHON][syntax=python]

\starttext
\startsection[title={A code listing}]
  \startparagraph
    This is a code listing
  \stopparagraph
  \startPYTHON
    # Python program listing
    def foobar
        print("Hello World")
  \stopPYTHON
  
  \startparagraph
    Now the same example with line numbering
  \stopparagraph
  \startPYTHON[numbering=yes]
    # Python program listing
    def foobar
        print("Hello World")
  \stopPYTHON
\stopsection
\stoptext

gets exported to

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

<!--

    input filename   : test
    processing date  : 2020-05-31T14:19:54-04:00
    context version  : 2020.05.25 23:39
    exporter version : 0.35

-->

<!-- This export file is used for filtering runtime only! -->

<document context="2020.05.25 23:39" date="2020-05-31T14:19:54-04:00" file="test" language="en" version="0.35" xmlns:m="http://www.w3.org/1998/Math/MathML">
 <metadata>
 </metadata>
 <section detail="section" chain="section" implicit="1" level="3">
  <sectioncaption>
   <sectionnumber>1</sectionnumber>
   <sectiontitle>A code listing</sectiontitle>
  </sectioncaption>
  <sectioncontent>
   <paragraph>This is a code listing</paragraph>
   <vimtyping detail="PYTHON">
    <verbatimline><syntaxgroup detail="Comment"># Python program listing</syntaxgroup></verbatimline>
    <verbatimline><syntaxgroup detail="Statement">def</syntaxgroup> <syntaxgroup detail="Function">foobar</syntaxgroup></verbatimline>
    <verbatimline>    <syntaxgroup detail="Function">print</syntaxgroup>(<syntaxgroup detail="String">"</syntaxgroup><syntaxgroup detail="String">Hello World</syntaxgroup><syntaxgroup detail="String">"</syntaxgroup>)</verbatimline>
   </vimtyping>
   <paragraph>Now the same example with line numbering</paragraph>
   <vimtyping detail="PYTHON">
    <verbatimline><linenumber>1</linenumber><syntaxgroup detail="Comment"># Python program listing</syntaxgroup></verbatimline>
    <verbatimline><linenumber>2</linenumber><syntaxgroup detail="Statement">def</syntaxgroup> <syntaxgroup detail="Function">foobar</syntaxgroup></verbatimline>
    <verbatimline><linenumber>3</linenumber>    <syntaxgroup detail="Function">print</syntaxgroup>(<syntaxgroup detail="String">"</syntaxgroup><syntaxgroup detail="String">Hello World</syntaxgroup><syntaxgroup detail="String">"</syntaxgroup>)</verbatimline>
   </vimtyping>
  </sectioncontent>
 </section>
</document>

@hernot
Contributor Author

hernot commented Jun 1, 2020 via email

@adityam
Owner

adityam commented Jun 2, 2020 via email

@hernot
Contributor Author

hernot commented Jun 2, 2020

> I still haven't looked at generating CSS. Since the module only provides two built-in color schemes, I was actually thinking of including a hand coded CSS rather than an automatically generated one. But I need to look into that.

Unless the changes in colour management are as radical as those for tagging in the latest MkIV and LMTX, my suggestion on how to create css from the colorscheme setup in ConTeXt and hook it into the css output list should, in theory, still work. In other words, there is no need for a static css, and for special cases users can still attach css manually through the cssfile option of the export.

> I'd also appreciate your help with testing to make sure there is no corner case.

OK, please keep me posted on progress and on which dedicated tasks or issues my support would be needed and/or appreciated.

@adityam
Owner

adityam commented Jun 15, 2020

The code now works with the latest version of ConTeXt. Can you test the export-2 branch. See tests/41-export.tex for a simple example.

Please note that I have not yet added the CSS functionality (haven't had the time). I'll try to test your code and then add something appropriate. Note that ConTeXt does generate bare-bones CSS definitions in *-export/style/filename-template.css.

@hernot
Contributor Author

hernot commented Jun 15, 2020

Yes, I will test with the latest version and report anything I find.

Yes, I know about *-export/style/filename-template.css. I decided against it for the following, probably selfish, reasons.

  • It is not automatically included in the css used by the output and has to be added manually by the user through the cssfile parameter of the export.

  • I have no idea whether it is recreated on every run or only created if not present.

  • If it is only recreated when missing, then I consider it a reminder for the user to collect all their specific css that is not provided by ConTeXt or any of its packages.

  • If it is recreated every time, why does the user still have to add it manually? And if the user does not add it, what does that mean for the typesetting of code?

  • What side effects would occur if t-vim added it automatically to ensure proper display of code?

Given these points, which I could not fully figure out, I decided against using this file for vimtyping-related styles. And yes, I admit I might have been a bit lazy in not approaching those on the mailing list who could have helped to get more clarity on the above points ;-)

@hernot
Contributor Author

hernot commented Jun 17, 2020

OK, it seems to work, even though all syntax groups are for now marked as verbatim lines. Are there any obstacles where your expertise would be needed, or is that something I could at least play with?

Just for my understanding: shouldn't 41-export.tex be placed in tests/vim, as it is a test for vimtyping and not for basic t-filter? Or is vimtyping just the vehicle, or rather the environment, that allows testing the underlying changes to t-filter?

@adityam
Owner

adityam commented Jun 17, 2020 via email

@hernot
Contributor Author

hernot commented Jun 17, 2020

Ah yes, you are right, I should have read more carefully. The rest is a matter of css.
In html every element appears on a line of its own, as if Firefox only sees the block setting for vimtyping and does not pick up the css div entries,

whereas in xhtml it seems to recognize the corresponding css settings.

As far as I remember I had similar struggles. I think it should resolve by dropping div, but I'm not quite sure anymore how I got rid of that strange behaviour. I just remember that it was gone somehow.

If you want to compare, I think you would have to replace syntaxline by verbatimline. But I'm not sure whether this helps.

@adityam
Owner

adityam commented Jun 17, 2020 via email

@hernot
Contributor Author

hernot commented Jun 18, 2020

OK, I'll figure it out again and, if necessary, open a pull request against your branch as soon as I think I have a usable solution.

EDIT
The reason the css is not respected is that the templates have to be added manually by including the following line in the preamble:

\setupexport[cssfile=41-export-templates.css]

That was one reason why I added the automatic inclusion of vimtyping styles, as the user either has their own custom css not related to vimtyping at all, or has no clue how to get the colorscheme right.

I have the following findings and suggestions, but for some of them I would need your further help to make them work properly.

  1. I would use the colorscheme as the detail of the vimtyping tags, as the language is already converted into <syntaxgroup detail=....> </syntaxgroup> or <div class="syntaxgroup ..."> </div>, whereas the color scheme would otherwise be lost. In css it is possible to switch styles for syntaxgroup depending on the outer tag in which it is enclosed.

  2. xhtml and html behave differently when you set the outer container (e.g. <vimtyping detail="Python"> </vimtyping>) to display: block and <verbatimline> </verbatimline> to display: inline.

In html it is necessary to end each syntaxline by following it with a <div class="break"><!-- --></div>, as defined by the ConTeXt core, to tell html to break the line. I'm not sure whether <br/> would also work.

In xhtml this is not necessary.

I think here you will have to look into how to tell the core to inject either the <br/> tag or the <div class=break> tags when outputting html, but not when outputting xhtml.

  3. xhtml refuses to keep a line number, which is written directly into the content of the <vimtyping detail=Python> </vimtyping> tag, on the same line as the <verbatimline> </verbatimline>.

Possibly line number injection should be changed a bit so that line numbers are placed within dedicated <div> tags. I'm not sure why this is not done; maybe Hans and others had their reasons. I think this is also an issue for you.

  4. Attached is the css, modified such that it looks reasonable in xhtml.
    41-export-templates.css.gz
    As you can see, I removed the div and defined the styles for classes only. I also limited the verbatimline to a specific vimtyping; the same can be done for all the syntaxgroups to allow several colorschemes in the same document. As soon as the other issues are identified and possibly solved, I can take care of proper css injection and creation/default styling.

EDIT
I think I managed it for html with css only, without explicit break tags; see the second version of the css (at least in Firefox):
41-export-templates.css.gz

But for xhtml I fear it would be necessary to enclose line numbers in a <linenumber> tag, possibly enclosing both the number and the <verbatimline> tag, or to include them within the <verbatimline> tag, to ensure that both stay on the same line. For this I need some time, as I'm worse at xhtml+css than at html+css.

EDIT
Why so complicated? To get proper line breaks, just set the white-space property of vimtyping to white-space: pre; and all lines are typeset as they are.
Further, concerning the XHTML issue: my mistake, just a typo in the css. The selectors that limit the formatting of verbatimline to occurrences inside the <vimtyping> tag had detail="Python" instead of detail="PYTHON", as it should have been. With these changes both html and XHTML are formatted properly:

41-export-templates.css.gz
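
Roughly, the idea looks like the sketch below. This is not the content of the attached file; the selectors assume the element names from the xml export (vimtyping, verbatimline) and, for html, div classes of the form "vimtyping PYTHON", which is an assumption about how the detail value ends up in the class attribute.

```css
/* Sketch only, not the attached stylesheet. */

/* The listing is a block that keeps line breaks and indentation as typed. */
vimtyping[detail="PYTHON"],
div.vimtyping.PYTHON {
    display: block;
    white-space: pre;
    font-family: monospace;
}

/* Lines stay inline, so (assuming the export has a newline after each
   line) the preserved newlines produce the breaks.
   The attribute value is matched case-sensitively:
   detail="Python" would not match detail="PYTHON". */
vimtyping[detail="PYTHON"] verbatimline,
div.vimtyping.PYTHON div.verbatimline {
    display: inline;
}
```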

In other words, if you are OK with it, I can take care of proper css and of creating more export tests.
For css I would suggest the following, if you agree:

  • Styles are linked to the colorscheme and not to the language, as the 2context.vim script creates language-independent TeX/ConTeXt code anyway; only proper setting of colours, fonts and the like is required for proper syntax highlighting, which is per se language independent.

  • Append the vimtyping.css files automatically to the list of additional css files specified through the cssfile parameter of the \setupexport command. Reason: the *templates.css file is not automatically linked into the exported *.html and *.xhtml files. It seems to be just what its name says: a template that ConTeXt users can use for their own .css files, which they can attach through the cssfile parameter (a single file or a list of files). The colorscheme definition macros allow users of vimtyping to define their own colorscheme or modify some elements of an existing one. When they do so, they will also expect their fancy new scheme to be used in html, xhtml and epub (which is based on xhtml). And even if I select one of the two predefined schemes, I would expect it to be used in html and xhtml export too, without having to manually link, write or even know any css at all, as I do not have to for pdf either. Just my approach and opinion.

  • Create a separate .lua file for everything to be handled in Lua and link it to the vimtyping-related files, instead of inlining Lua code within tex and mkiv files.

  • Do you know whether anything special has to be done for lmtx, or does lmtx automatically fall back to mkiv if no lmtx-specific files and commands are found? Or should that be postponed for now?

@hernot
Contributor Author

hernot commented Jun 23, 2020

I'm nearly there in making things work as expected for proper display of typeset syntax in export to html, xhtml and xml. Only one little thing remains to be solved, where I need your advice and guidance on how to make it work without breaking anything.

The t-vim module is based on t-filter, and the outermost set of <div> tags for each syntax display is placed by t-filter.
It sets up the outermost <div> tag on line 488 of t-filter.tex:

\dostarttagged{\externalfilterparameter\c!taglabel}\currentexternalfilter

The class name is set to the passed taglabel parameter and the tag detail specifier is set to the content of \currentexternalfilter.
For vimtyping this holds the name of the language being typeset, e.g. "PYTHON". But for css styling the name of the language has no meaning at this stage: the language has already been converted into a language-independent structure of <div class="verbatimline"> and <div class="syntaxgroup [syntaxelement]"> tags through the 2context.vim script.

Therefore the tag for vimtyping should rather use the colorscheme name as tag detail information, like so

<div class="vimtyping pscolor">

and in general

<div class="vimtyping [colorscheme]">

How should I, may I, or do I have to modify or extend the related line in t-filter.tex to be able to select an alternative value for the tag detail instead of the value stored in \currentexternalfilter?
In doing so I want to avoid undesired side effects on other external filters defined elsewhere.
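
To illustrate why the colorscheme is the more useful detail value for styling: with it, syntax-group rules can be scoped per scheme. A rough sketch only; the class pattern follows the markup shown above, while the second scheme name "myscheme", the use of the group names Comment and Statement as classes, and all colour values are placeholders, not the values t-vim uses.

```css
/* Sketch only: colours and the scheme name "myscheme" are placeholders. */

/* Rules apply only inside a listing tagged with the pscolor scheme. */
.vimtyping.pscolor .syntaxgroup.Comment   { color: #0000cc; font-style: italic; }
.vimtyping.pscolor .syntaxgroup.Statement { color: #990000; font-weight: bold; }

/* A second scheme in the same document gets its own, independent rules. */
.vimtyping.myscheme .syntaxgroup.Comment   { color: #666666; }
.vimtyping.myscheme .syntaxgroup.Statement { color: #000000; font-weight: bold; }
```

With the language name as detail, the same rules would have to be repeated for every language that shares a scheme.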

@adityam
Owner

adityam commented Jun 23, 2020 via email

@hernot
Contributor Author

hernot commented Jun 23, 2020

Whichever option you prefer, even though I wonder how often one really uses different colour schemes for listings of the same language. I think let's start with option 1; if use cases for the second occur, refactoring can be done then. Or do you already have existing documents where both colorscheme and language should be included in the tag detail?

Concerning \setupelements[properties=yes]: is that specific to t-filter and t-vim, or global for ConTeXt? If the latter, I can check whether it is set automatically when the \setupbackend and \setupexport commands are used.

@adityam
Owner

adityam commented Jun 23, 2020 via email

@hernot
Contributor Author

hernot commented Jun 23, 2020

Are you sure about \setupelements[properties=yes]? I did not find it in the docs or in the standalone code. Where is it mentioned?

@adityam
Owner

adityam commented Jun 23, 2020 via email

@hernot
Contributor Author

hernot commented Jun 23, 2020

Ah, now I get what you mean, but from reading epub-mkiv.pdf I'm not sure whether that is relevant for anything beyond structures created through \startelement \stopelement; at least that is how I read the last section (6). It is only mentioned there, and all examples shown there use \startelement \stopelement. And the first paragraph of that section says

"[...]The default output reflects the structure present in the document. If that is not enough you can add your own structure [...]"

As I see it, t-filter and t-vim output is structure that is already present in the document. So I'm not sure whether the properties parameter of \setupexport would have any effect on its tags, or how it could be made to have an effect. And would that be necessary at all?

After all, an epub is basically a zip archive including all files required to properly display the document, with the xml describing the pages and some more information held in additional meta files. And xml and xhtml do not differ that much.

Maybe Hans can shed some light on it.

@adityam
Owner

adityam commented Jun 24, 2020

There are two ways to implement tagging.

The first, which we currently follow in t-filter/t-vim is

\dostarttagged{name}{value}
...
\dostoptagged

which gets translated to

<name detail=value>
...
</name>

The other option is

\startelement[name][option1=value1, option2=value2]
...
\stopelement

By default, this gets translated to

<name>
...
</name>

But, if \setupexport[properties=yes] is set, this gets translated to

...

However, if \setupexport[properties=prefix] is set, this gets translated to

...

I get the impression that \dostarttagged is for internal code while \startelement is for user code. The limitation of \dostarttagged is that we can only have one option. (There is also something which captures the inheritance structure with chain, but that might not be relevant for us).

If we use \startelement in t-filter, then (i) we need to enable \setupexport[properties=yes] but this might interfere with user code. For example, in a personal document, I might or might not want properties to be enabled globally. (ii) If the user sets \setupexport[properties=prefix], then the output of t-filter/t-vim gets affected. I don't like that.

I have implemented a version in export-2 using \dostarttagged so that the output is

<vimtyping detail="name-of-colorscheme">
....
</vimtyping>

@hernot
Contributor Author

hernot commented Jun 24, 2020

So we have the same impression: \startelement \stopelement is for document authors who want to add custom elements that have no corresponding item or element in other output formats like PDF. I would opt for sticking with \dostarttagged \dostoptagged for now, as it is more consistent with standard ConTeXt and has fewer side effects. Could it be that the chain attributes are filled by the third variant, \dostarttaggedchained, where the detail specifier is followed by a reference to another tag? I'm not sure whether that has any value for vimtyping. But doesn't that, for now, deviate a bit from the initial idea/request to get comparable results from vimtyping in the default pdf output and in html, xhtml and xml output? In other words, unless you already have a use case or know somebody who has, I'm fine with what is implemented so far using \dostarttagged, and also fine if for now there were an optional parameter for providing alternative content for the tag detail attribute, which by default contains the content of \currentexternalfilter as it does now. That means less effort and less hassle with side effects for you, and no dependency on the good will of the user to enable properties in \setupexport. Anything else would have to be coordinated, thoroughly discussed and designed together with Hans and others, as it would mean an extension of the ConTeXt exporter backend and tag system.

@adityam
Owner

adityam commented Jun 24, 2020 via email

@hernot
Contributor Author

hernot commented Jun 25, 2020

Oh, I have an up-to-date copy of your export-2 in my repo, and in my local directory the css is already included. I just need to run some further tests, and then I would open a new pull request against your export-2 and close this one, if that is OK with you. That pull request would then include most if not everything needed for merging into main.

@adityam
Owner

adityam commented Jun 25, 2020 via email

@hernot
Contributor Author

hernot commented Jun 25, 2020

OK done by #38.

@hernot hernot closed this Jun 25, 2020
This was referenced Jun 26, 2020

2 participants