texla

A minimal and easily extensible LaTeX parser.

It's minimal because it only splits the tex source, without otherwise transforming it. It breaks LaTeX down into sections, environments, math, commands and plain text, creating a simple tree of Block objects.

It's easily extensible because supporting a new command or environment only requires a Python class that defines a new Block. Moreover, the options and arguments of LaTeX commands and environments can be parsed with a simple API.
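
To make the idea of a tree of Blocks more concrete, here is a minimal sketch. The class names (Block, EmphBlock) and methods (add, walk) are hypothetical and chosen only for illustration; they are not texla's actual API, which is described in the linked documentation.

```python
# Hypothetical sketch: these classes are NOT texla's real API.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """A node in the parse tree: a name, some content and child blocks."""
    name: str
    content: str = ""
    children: List["Block"] = field(default_factory=list)

    def add(self, child: "Block") -> "Block":
        """Append a child block and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0):
        """Yield (depth, block) pairs in document order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


class EmphBlock(Block):
    """Hypothetical block for an emphasis command: supporting a new
    command amounts to adding one small class like this."""

    def __init__(self, text: str):
        super().__init__(name="emph", content=text)


# Build a tiny tree: document -> section -> text, emph, math
root = Block("document")
section = root.add(Block("section", "Introduction"))
section.add(Block("text", "Some plain text with "))
section.add(EmphBlock("emphasis"))
section.add(Block("math", "E = mc^2"))

outline = [(depth, block.name) for depth, block in root.walk()]
```

Walking the tree visits the document first, then the section, then its three children in source order.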

Further documentation can be found at: https://meta.wikitolearn.org/Texla

Install Texla

You can use Texla from source or install it from PyPI.

pip install texla

The command texla will be available globally.

Run Texla

Just put a configs.yaml in your working directory and run texla from the command line. A --debug option is available for a more verbose output.

texla
texla --debug

Texla Configuration

The execution of texla is controlled by the configs.yaml file.

There are a few parameters to set:

  • renderer: the output format of the conversion. mediawiki is the only one available for now.

  • input_path : the path of the main tex file to convert. Texla is able to convert complex documents with many included subfiles: in "input_path" you have to put the main tex file that includes all the subfiles.

  • output_path : the path for the output files. Write it without the extension because it will be the base filename for all output files.

  • doc_title: the title of the document. Texla doesn't read the title written inside the tex source. doc_title is used as the base namespace for pages.

  • base_path: texla exports pages hierarchically, giving each page a unique URL. base_path is used as the root for the page URLs. It can be empty.

  • collapse_content_level: the sectioning of a LaTeX file is a tree, and every part of the tex document has a level. The root page, which contains the index of the document, has level -1. The first level of sectioning in the document has level 0. Texla converts sections into pages, and each page gets the level of its section. The content of pages with a level greater than collapse_content_level is inserted into the text of the parent page as a paragraph.

  • collapse_pages_level: if a page has a level greater than collapse_pages_level and is not collapsed, it is moved up the page tree to the level given by collapse_pages_level.

  • create_index: if True, an index is created in the root page.

  • export_format: text is the only one available for now.

  • export_single_pages: if True, a file is created for every page and saved in a directory called "output_path"_pages.

  • export_pages_tree: if True, the pages are exported in a tree of directories (rooted in "output_path"_pages) corresponding to the actual sectioning.

  • export_book_page: if True, the page required by Project:Books is created.

  • print_preparsed_tex: if True, a debug file called preparsed.tex is saved, containing the preparsed tex.

  • lang: localization for keywords. The available languages are those inside the i18n.yaml file. Contributions appreciated :)

  • plugins: [...] list of the enabled plugins. The order of this list is the order of execution, so be careful.

  • plugins_configs: YAML dictionary containing the key-value configuration for each plugin (see configs_example.yaml).
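
The interaction between collapse_content_level and collapse_pages_level in the list above can be easier to follow with a small model. The sketch below is one reading of those two rules, not texla's actual code: the function name place_page, its return values and the example settings are all assumptions made for illustration.

```python
# Sketch of the collapse rules as described in the parameter list.
# The function name and return values are illustrative assumptions,
# not texla's actual implementation.

ROOT_LEVEL = -1  # level of the root page that holds the document index


def place_page(level, collapse_content_level, collapse_pages_level):
    """Return (action, final_level) for a section found at `level`."""
    if level > collapse_content_level:
        # Too deep to be a page: its content becomes a paragraph
        # in the text of the parent page.
        return ("collapsed", None)
    if level > collapse_pages_level:
        # Still a page, but moved up the tree to collapse_pages_level.
        return ("moved", collapse_pages_level)
    # Shallow enough: the page keeps the level of its section.
    return ("page", level)


# Assumed example settings: content is kept down to level 2,
# pages are kept down to level 1.
placements = {
    level: place_page(level, collapse_content_level=2, collapse_pages_level=1)
    for level in range(ROOT_LEVEL, 4)
}
```

With these assumed settings, levels -1, 0 and 1 stay where they are, level 2 remains a separate page but is moved up to level 1, and level 3 is folded into its parent page as a paragraph.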

Plugins

The available plugins are:

  • MathCheck: it fixes the math to be correct for WikiToLearn rendering

  • math_check_online: at the end of the rendering it calls the WikiToLearn math renderer to check if there are errors in the math.

  • space_check: it removes the single spaces after a newline.
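
Because plugins run in the order given in the plugins list, the same set of plugins can produce different output depending on that order. The sketch below illustrates this with a toy pipeline. It is not texla's plugin interface: run_plugins and the registry are assumptions, collapse_double_spaces is an invented plugin, and the space_check function is only one possible interpretation of "removes the single spaces after a newline".

```python
import re


def space_check(text):
    """Interpretation of space_check: drop a single space that directly
    follows a newline (two or more spaces are left untouched)."""
    return re.sub(r"\n (?! )", "\n", text)


def collapse_double_spaces(text):
    """Invented plugin, for illustration only: turn each pair of
    spaces into a single space."""
    return text.replace("  ", " ")


# Hypothetical registry mapping plugin names to functions.
REGISTRY = {
    "space_check": space_check,
    "collapse_double_spaces": collapse_double_spaces,
}


def run_plugins(text, plugin_names):
    """Apply the named plugins one after another, in list order."""
    for name in plugin_names:
        text = REGISTRY[name](text)
    return text


source = "a\n  b"
first_order = run_plugins(source, ["space_check", "collapse_double_spaces"])
second_order = run_plugins(source, ["collapse_double_spaces", "space_check"])
```

In the first order the two leading spaces survive space_check and are then reduced to one. In the second order they are reduced to one first, and space_check then removes it, so the two results differ.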