space lost #24

Closed
gamboz opened this issue Dec 5, 2017 · 9 comments

@gamboz

gamboz commented Dec 5, 2017

In the following docx file, the space between "40" and "MHz" is lost:
https://medialab.sissa.it/owncloud/index.php/s/zkxFGDvNAehVatl
I'm not sure whether this is an error, but I'm reporting it because the TeX/PDF output looks different from the docx file.

@mkraetke
Member

mkraetke commented Dec 5, 2017

Thanks for the report, I'm investigating your issue.

@gimsieke
Contributor

gimsieke commented Dec 5, 2017

This seems to be an issue in omml2mml.

@mkraetke
Member

mkraetke commented Dec 5, 2017

The issue is that our docx2hub module converts

<m:t xml:space="preserve">40 </m:t>

to

<mml:mn>40</mml:mn>
<mml:mi> </mml:mi>

I think the whitespace should be coded in TeX either as "\ " (a backslash followed by a space) or as \text{ }.

@gimsieke
Contributor

gimsieke commented Dec 5, 2017

Maybe we should convert an mi that only contains (significant) whitespace to mtext or mspace. Then the TeX code will probably be ok.
There’s an mml-space-handling option in docx2hub.xpl. We currently pass it only to the MathType converter. This option should eventually be passed to omml2mml.xsl, too (and acted upon accordingly).
But turning it into an mtext for now is probably the quickest solution.
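
The rewrite suggested above can be sketched in a few lines. This is only an illustration in Python, not the omml2mml.xsl code: the function name fix_whitespace_mi and its mode switch are hypothetical, and the 0.25em width is simply the value that appears later in this thread.

```python
import xml.etree.ElementTree as ET

MML = "http://www.w3.org/1998/Math/MathML"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("mml", MML)


def fix_whitespace_mi(root, mode="mtext"):
    """Rewrite whitespace-only mi elements in place.

    'mode' is a hypothetical switch: "mtext" keeps the space as text,
    "mspace" replaces it with an explicit spacing element.
    """
    for el in list(root.iter(f"{{{MML}}}mi")):
        text = el.text or ""
        if text == "" or text.strip() != "":
            continue  # empty mi or a real identifier: leave untouched
        if mode == "mspace":
            el.tag = f"{{{MML}}}mspace"
            el.text = None
            el.set("width", "0.25em")  # assumed width
        else:
            el.tag = f"{{{MML}}}mtext"
            el.set(XML_SPACE, "preserve")
    return root


SAMPLE = (
    f'<mml:math xmlns:mml="{MML}">'
    "<mml:mn>40</mml:mn><mml:mi> </mml:mi><mml:mtext>MHz</mml:mtext>"
    "</mml:math>"
)

if __name__ == "__main__":
    fixed = fix_whitespace_mi(ET.fromstring(SAMPLE))
    print(ET.tostring(fixed, encoding="unicode"))
```

Either mode keeps the space visible to later pipeline steps; which one is preferable depends on how the downstream TeX conversion treats mtext and mspace.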

@mkraetke
Member

mkraetke commented Dec 5, 2017

I resolved one issue: omml2mml.xsl now converts the m:t with whitespace to

<mml:mn>40</mml:mn>
<mml:mtext xml:space="preserve"> </mml:mtext>
<mml:mtext>MHz</mml:mtext>

Unfortunately, there seems to be a bit of MathML normalization in our pipeline, which drops the mtext. I'll investigate this further.
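
The thread does not show the normalization step itself, so the following Python sketch is only a guess at the kind of rule involved: a cleanup pass that removes blank mtext elements will also drop the space unless it honors xml:space="preserve". The function drop_blank_mtext and its honor_xml_space flag are hypothetical names, not part of docx2hub or docx2tex.

```python
import xml.etree.ElementTree as ET

MML = "http://www.w3.org/1998/Math/MathML"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def drop_blank_mtext(root, honor_xml_space=True):
    """Remove whitespace-only mtext elements (hypothetical normalization).

    With honor_xml_space=True, elements marked xml:space="preserve" survive.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != f"{{{MML}}}mtext":
                continue
            blank = (child.text or "").strip() == ""
            keep = honor_xml_space and child.get(XML_SPACE) == "preserve"
            if blank and not keep:
                parent.remove(child)
    return root


SAMPLE = (
    f'<mml:math xmlns:mml="{MML}">'
    "<mml:mn>40</mml:mn>"
    '<mml:mtext xml:space="preserve"> </mml:mtext>'
    "<mml:mtext>MHz</mml:mtext>"
    "</mml:math>"
)

if __name__ == "__main__":
    naive = drop_blank_mtext(ET.fromstring(SAMPLE), honor_xml_space=False)
    careful = drop_blank_mtext(ET.fromstring(SAMPLE), honor_xml_space=True)
    print(len(naive), len(careful))  # number of child elements left in each
```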

@gimsieke
Contributor

gimsieke commented Dec 5, 2017

I just noticed that mml-space-handling is already being honored! If you just invoke docx2hub.xpl, the default setting of mspace will kick in and the resulting expression will look like:

<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline">
   <mml:mn>40</mml:mn>
   <mml:mspace width="0.25em"/>
   <mml:mtext>MHz</mml:mtext>
</mml:math>

I haven’t tested it with the full docx2tex pipeline though.

@mkraetke
Member

mkraetke commented Dec 6, 2017

This option was set to xml-space, so <mml:mtext xml:space="preserve"> </mml:mtext> is the expected output for that value. Unfortunately, after I fixed the MathML normalization, lots of \text{} environments appeared in our test data. This happens when authors use text style in the equation editor where it is not necessary. To write 40 MHz, you do not need an equation editor at all.

However, I've changed mml-space-handling from xml-space to mspace for docx2tex, which results in fewer unintended \text{} environments where authors wrote their equations sloppily. Finally, the equation now reads as follows:

detector at $40\:\mathrm{MHz}$, i.e.,
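
To make the link between the mspace output and this TeX result concrete, here is a small Python sketch. It is not the docx2tex implementation: the function to_tex and the SPACE_MACROS table are hypothetical, and the mappings (0.25em to \: and mtext to \mathrm) are assumptions read off the single example in this thread.

```python
import xml.etree.ElementTree as ET

MML = "{http://www.w3.org/1998/Math/MathML}"

# Assumed width-to-macro table; only the 0.25em row is suggested by the thread.
SPACE_MACROS = {"0.25em": r"\:"}


def to_tex(math_xml):
    """Convert a flat mn/mspace/mtext sequence to inline TeX (toy converter)."""
    root = ET.fromstring(math_xml)
    parts = []
    for el in root:
        name = el.tag.replace(MML, "")
        if name == "mspace":
            # Unknown widths fall back to a control space (assumption).
            parts.append(SPACE_MACROS.get(el.get("width"), r"\ "))
        elif name == "mtext":
            parts.append(r"\mathrm{" + (el.text or "") + "}")
        else:
            parts.append(el.text or "")
    return "$" + "".join(parts) + "$"


SAMPLE = """<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline">
   <mml:mn>40</mml:mn>
   <mml:mspace width="0.25em"/>
   <mml:mtext>MHz</mml:mtext>
</mml:math>"""

if __name__ == "__main__":
    print(to_tex(SAMPLE))  # prints $40\:\mathrm{MHz}$
```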

@mkraetke mkraetke closed this as completed Dec 6, 2017
@gamboz
Author

gamboz commented Dec 14, 2017

Hi, thank you for the fast solution.

I'm not sure if it is related to this issue, but since the last commit, my clone of docx2tex fails.
I've also tried with a new pristine checkout.
The errors are related to conf.csv not validating and to an "Undeclared variable in XPath expression: $image-output-dir".
The first error disappears if I specify the conf.xml file with the "-c" option of d2t.

Please find the docx file and the d2t log here:
https://medialab.sissa.it/owncloud/index.php/s/6I6rKxHflXeu3co

@mkraetke
Member

The bug is fixed. I recently added an option to pass a custom image directory.
