--includestyles fix for latexmlc and 1011.5551 fixes #873

Merged
brucemiller merged 3 commits into brucemiller:master from dginev:includestyles-lowcaps on Oct 22, 2017

@dginev
Collaborator

dginev commented Oct 1, 2017

I was investigating a Fatal conversion job from the cortex report over arXiv for a change, which happened to be a very impressive 85 manuscript:
https://arxiv.org/abs/1011.5551

They use a custom documentclass, and rather advanced latex, so we're far from "no problem" land, but can find interesting low-level fixes:

  1. I wanted to quickly check if running with --includestyles helps, and noticed that a) latexmlc does not enable the behavior when given the option, and b) the option doesn't propagate to LoadClass in general.

    • The first was due to a camelCase name mismatch; the second, I think, was simply never implemented.
  2. Since enabling includestyles didn't really help (the loading got stuck in an infinite loop!), I next tried gradually improving coverage to avoid the Fatal error in the regular conversion pass from cortex, which is:

Fatal:unexpected:<endgroup> Attempt to pop last locked stack frame

I think it is related to a curious issue with using \sfcode`\.=1000 (at least that is the first obvious error), so I started adding small patches along the way. I haven't escaped the Fatal yet, so this PR may stay open for a little while.
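To illustrate how a camelCase name mismatch like the one in item 1 can silently disable an option, here is a minimal Python sketch. It is not LaTeXML's actual code (which is Perl): the key spelling `includeStyles` and the helpers `lookup_strict` and `normalize_keys` are hypothetical and only show the general failure mode and one possible fix.

```python
def lookup_strict(opts, key):
    """Case-sensitive lookup: a missing key quietly reads as 'off'."""
    return opts.get(key, False)


def normalize_keys(opts):
    """Lowercase every option name so both spellings resolve to one key."""
    return {name.lower(): value for name, value in opts.items()}


# Hypothetical: the option parser stores the flag under a camelCase key...
parsed = {"includeStyles": True}

# ...while the consumer asks for the all-lowercase name and gets the default.
missed = lookup_strict(parsed, "includestyles")

# After normalizing the keys, the same lookup finds the flag.
found = lookup_strict(normalize_keys(parsed), "includestyles")
```

No error is raised in the mismatched case, which is why this kind of bug tends to go unnoticed until someone checks whether the option has any effect.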

@dginev

Collaborator

dginev commented Oct 1, 2017

Another observation related to Fatals, on example cond-mat9807421, which had "too many errors" as the error. Turning --includestyles on for that example converted it into a very good-looking document with just a few math warnings.

I think we can seriously consider turning the --includestyles option on for the arXiv conversion, but there may be a smart way to do it. For example, we may continue running with it off by default, as now, and only if the document has Errors or Fatals, would the latexml worker rerun it with that option on. Just thinking out loud here...
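The fallback idea above could be sketched roughly as follows. This is only an illustration of the control flow, not how the latexml worker is implemented: the `convert` callable, the severity labels, and the function names are assumptions made for the example. `convert(source, include_styles)` stands in for whatever actually runs the conversion and reports the worst severity it saw.

```python
from typing import Callable, Tuple

# Assumed severity labels, ordered from best to worst outcome.
SEVERITY_ORDER = ["ok", "warning", "error", "fatal"]


def needs_retry(severity: str) -> bool:
    """Retry only when the first pass ended in an Error or a Fatal."""
    return severity in ("error", "fatal")


def convert_with_fallback(
    source: str,
    convert: Callable[[str, bool], str],
) -> Tuple[str, bool]:
    """Run without --includestyles first; rerun with it only on failure.

    Returns (severity, used_includestyles) for the result that is kept.
    """
    first = convert(source, False)
    if not needs_retry(first):
        return first, False

    second = convert(source, True)
    # Keep the rerun only if it is strictly better than the first pass.
    if SEVERITY_ORDER.index(second) < SEVERITY_ORDER.index(first):
        return second, True
    return first, False


# Usage with a stand-in converter (hypothetical outcomes, for illustration only).
def fake_convert(source: str, include_styles: bool) -> str:
    outcomes = {
        ("good.tex", False): "warning",
        ("styled.tex", False): "fatal",
        ("styled.tex", True): "warning",
        ("looping.tex", False): "error",
        ("looping.tex", True): "fatal",
    }
    return outcomes[(source, include_styles)]


results = {
    name: convert_with_fallback(name, fake_convert)
    for name in ("good.tex", "styled.tex", "looping.tex")
}
```

Comparing the two results before keeping the rerun covers the case where raw style loading makes things worse rather than better, as with the loading loop seen on 1011.5551.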

@brucemiller

Owner

brucemiller commented Oct 2, 2017

Thinking out loud... I've been wondering what would happen if we processed latex.ltx. How slow? Errors? At least at the end all expected macros would be defined :>

@dginev

Collaborator

dginev commented Oct 2, 2017

Oh wow. That should be rather turbulent. But it would be pretty darn awesome if we could claim we have all of LaTeX's internals actually loaded. I would worry about coverage/errors more than speed for the moment; I think we have more high-level solutions for performance now.

@brucemiller brucemiller merged commit b1cf693 into brucemiller:master Oct 22, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
@brucemiller

Owner

brucemiller commented Oct 22, 2017

Sorry... it's about time...

@dginev dginev deleted the dginev:includestyles-lowcaps branch Feb 25, 2018
