Cleanup and Refactor Locators #1013

merged 5 commits into from Jul 27, 2018



@tkw1536 tkw1536 commented Jul 12, 2018

This pull request refactors the behaviour of locators. In particular, it turns locators into instances of a new class, LaTeXML::Common::Locator, which are only converted to strings when displayed to the user.

The locator class also improves upon the old locators and is internally a quintuple consisting of

  • the source filename, or undef in case of an anonymous string source, or '' in case of a literal string source
  • the starting line and column numbers of the locator
  • the ending line and column numbers of the locator (or undef if not applicable)

This means that an instance of a locator class can refer either to a point, or an entire range in the source file. When requesting a Locator from the Mouth, the code tries to estimate the location and length of the last token, thereby returning a range.
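To make the point-versus-range distinction concrete, here is a minimal sketch in Python. It is not LaTeXML's Perl implementation: the class name mirrors the PR, but the field names, the `is_range` and `describe` helpers, and the display format are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Locator:
    """Illustrative stand-in for the quintuple (hypothetical field names)."""
    source: Optional[str]          # filename, or None / "" for string sources
    from_line: int
    from_col: int
    to_line: Optional[int] = None  # None when the locator is a single point
    to_col: Optional[int] = None

    def is_range(self) -> bool:
        # A locator is a range only if both end coordinates are present.
        return self.to_line is not None and self.to_col is not None

    def describe(self) -> str:
        # Hypothetical display format; conversion to a string happens only here,
        # i.e. at the moment the locator is shown to the user.
        where = self.source if self.source else "string input"
        start = f"line {self.from_line} col {self.from_col}"
        if self.is_range():
            return f"{where}; from {start} to line {self.to_line} col {self.to_col}"
        return f"{where}; {start}"


point = Locator("paper.tex", 12, 4)
span = Locator("paper.tex", 12, 4, 12, 9)
print(point.describe())
print(span.describe())
```

Keeping the coordinates as structured fields and deferring the string conversion is what allows the same object to be rendered in more than one form later on.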

The new locator class also has a toAttribute method that turns an instance of a locator into an XPointer, thereby superseding #739.

This commit removes several dead pieces of code within getLocator()
calls, notably within Core::Document, Core::Gullet, Core::Mouth,
Core::Stomach, Common::KeyVals and Common::Error. This optimization is
in preparation for a general cleanup of getLocator.

The entire code base of LaTeXML only ever calls getLocator with either
no arguments, $long = -1 or $long being passed through from the caller.
Hence, all getLocator subs can ignore all other cases.

Furthermore, all calls to getLocator() where the argument is -1 are
going to Mouth (or get directly passed through to one), hence all
non-mouth getLocator() methods can entirely ignore their argument and
assume it to be undef.

This leads to the following changes:

* Core::Document calls getLocator() on a box which takes no arguments.
Hence no arguments need to be passed through.
* Core::Gullet is only ever called with $long = undef, hence the code can
be simplified.
* Core::Mouth is only ever called with $long = undef or $long = -1,
hence one of the branches of the if() statement never occurs.
* Core::Stomach calls getLocator() on a gullet, which takes no
arguments.  Hence no arguments need to be passed through.
* Core::KeyVals calls getLocator() with an empty list of arguments, the
unused brackets can be removed.
* Common::Error made some calls to getLocator() with an empty list of
arguments, the unused brackets can be removed.

Previously, locators were strings. This made it difficult to work with
them in a programmatic setting, or to extend them to contain source
ranges instead of points.

This commit updates the code to instead turn locators into a new
LaTeXML::Common::Locator class. It furthermore refactors the code, so
that the stringification of the locators only occurs upon output to the
user.

Previously $$mouth{source} was "Literal String ..." or "Anonymous
String" in case of an anonymous String. This commit instead turns those
cases into undef, to make it much easier to programmatically check
whether a source is anonymous or not.

dginev commented Jul 12, 2018

Travis doesn't like your PR? Also, I am happy there is no new command-line option proposed, or I would need to link to the "ultimate PR response video", which Bruce countered me with some time ago.

Here it is for reference, and artistic enjoyment:

This commit adds basic support for locators representing ranges instead
of points to the LaTeXML::Common::Locator. It returns ranges for the
following two cases:
* LaTeXML::Core::Mouth records a locator representing the range of the
current token.
* LaTeXML::Core::Whatsit records a locator that represents the range
of a body should it be present. Otherwise, it returns a point locator.

This commit also updates the behaviour of the stringify and toString
methods accordingly, and furthermore adds a new toAttribute method to
turn these locators into XPointer form for use within an XML Attribute.

Previously, we updated the source property of mouths to be undef in case of
anonymous strings.

This commit updates the previous behaviour and sets the mouth source to
be '' (the empty string) in case of a literal string.  This makes it
easier to see the difference between anonymous string inputs, which are
created on the fly by various bindings, and literal strings which can be
passed as valid inputs to LaTeXML.
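
The resulting three-way convention for the mouth source can be sketched as follows. This is an illustrative Python analogue rather than LaTeXML's Perl code: `None` stands in for Perl's undef, and the function name `source_kind` and its return labels are hypothetical.

```python
from typing import Optional


def source_kind(source: Optional[str]) -> str:
    """Classify a mouth source value (hypothetical helper, not LaTeXML API)."""
    # Check for None first: both None and "" are falsy, so a plain
    # truthiness test would conflate the two string cases.
    if source is None:
        return "anonymous"   # created on the fly by a binding
    if source == "":
        return "literal"     # literal string passed in as input
    return "file"            # any other value is treated as a filename


for value in (None, "", "paper.tex"):
    print(repr(value), "->", source_kind(value))
```

Using two distinct sentinel values instead of descriptive strings means callers can branch on the source with a simple comparison rather than parsing text.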

3 participants