
LaTeX2Python (tex2py)

tex2py converts LaTeX into a Python parse tree using TexSoup. This allows you to navigate LaTeX files as trees, using either the default or a custom hierarchy. See md2py for a Markdown parse tree.

Note: tex2py currently supports only Python 3.

created by Alvin Wan

Installation

Install via pip.

pip install tex2py

Usage

LaTeX2Python offers only one function, tex2py, which generates a Python parse tree from LaTeX. This object is a navigable "Tree of Contents" abstraction for the LaTeX file.

Take, for example, the following LaTeX file. (See pdf)

chikin.tex

\documentclass[a4paper]{article}
\begin{document}

\section{Chikin Tales}

\subsection{Chikin Fly}

Chickens don't fly. They do only the following:

\begin{itemize}
\item waddle
\item plop
\end{itemize}

\section{Chikin Scream}

\subsection{Plopping}

Plopping involves three steps:

\begin{enumerate}
\item squawk
\item plop
\item repeat, unless ordered to squat
\end{enumerate}

\subsection{I Scream}

\end{document}

Akin to a navigation bar, the TreeOfContents object allows you to expand a LaTeX file one level at a time. Running tex2py on the above file generates a tree that abstracts the structure below.

          <Document>
          /        \
  Chikin Tales   Chikin Scream
      /            /     \
 Chikin Fly  Plopping   I Scream
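
The grouping of headings into the tree above can be sketched in plain Python. This is a minimal illustration, not tex2py's implementation: the Node class, the build_tree function, and the regex-based heading scan are all hypothetical, and the scan assumes heading titles contain no nested braces. It shows how a hierarchy tuple (outermost level first) decides which headings nest under which.

```python
import re

# Assumed default hierarchy for this sketch, outermost level first.
HIERARCHY = ("section", "subsection")


class Node:
    """A heading plus the headings nested beneath it (illustrative only)."""

    def __init__(self, name, title):
        self.name = name      # e.g. "section"
        self.title = title    # e.g. "Chikin Tales"
        self.branches = []    # child Node objects

    def __repr__(self):
        return self.title


def build_tree(latex, hierarchy=HIERARCHY):
    """Group the headings found in `latex` into a tree following `hierarchy`."""
    root = Node("document", "[document]")
    stack = [(-1, root)]  # (depth, node) pairs for the currently open headings
    names = "|".join(map(re.escape, hierarchy))
    # Matches e.g. \section{Title}; titles with nested braces are not handled.
    pattern = re.compile(r"\\(%s)\{([^}]*)\}" % names)
    for name, title in pattern.findall(latex):
        depth = hierarchy.index(name)
        # Close every open heading at the same or a deeper level.
        while stack[-1][0] >= depth:
            stack.pop()
        node = Node(name, title)
        stack[-1][1].branches.append(node)
        stack.append((depth, node))
    return root


sample = r"""
\section{Chikin Tales}
\subsection{Chikin Fly}
\section{Chikin Scream}
\subsection{Plopping}
\subsection{I Scream}
"""

tree = build_tree(sample)
print(tree.branches)              # [Chikin Tales, Chikin Scream]
print(tree.branches[1].branches)  # [Plopping, I Scream]

# A custom hierarchy that ignores sections flattens the tree.
flat = build_tree(sample, hierarchy=("subsection",))
print(flat.branches)              # [Chikin Fly, Plopping, I Scream]
```

Passing a different hierarchy tuple changes only which commands count as headings and how they nest; the input text is untouched.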

At the global level, we can access the title of the first section.

>>> from tex2py import tex2py
>>> with open('chikin.tex') as f: data = f.read()
>>> toc = tex2py(data)
>>> toc.section
Chikin Tales
>>> str(toc.section)
'Chikin Tales'

Notice that at this level, there are no subsections.

>>> list(toc.subsections)
[]

The document has two sections, and the first section has one subsection beneath it. We can access both levels.

>>> list(toc.sections)
[Chikin Tales, Chikin Scream]
>>> list(toc.section.subsections)
[Chikin Fly]
>>> toc.section.subsection
Chikin Fly

The TreeOfContents class also defines a few more conveniences. Among them is support for indexing. To access the ith child of an <element>, use <element>[i] instead of <element>.branches[i].

See below for example usage.

>>> toc.section.branches[0] == toc.section[0] == toc.section.subsection
True
>>> list(toc.sections)[1] == toc[1]
True
>>> toc[1]
Chikin Scream
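
For intuition, conveniences like these can be built on Python's __getitem__ and __getattr__ hooks. The sketch below is a toy stand-in, not the tex2py source: the Node class, its TAGS tuple, and the sample data are assumptions made for illustration.

```python
class Node:
    """Toy tree node illustrating indexing and tag-name attribute lookup."""

    TAGS = ("section", "subsection")  # assumed heading names for this sketch

    def __init__(self, name, title, branches=()):
        self.name = name
        self.title = title
        self.branches = list(branches)

    def __repr__(self):
        return self.title

    def __getitem__(self, i):
        # node[i] is shorthand for node.branches[i].
        return self.branches[i]

    def __getattr__(self, attr):
        # Python calls this only when normal attribute lookup fails.
        if attr in self.TAGS:
            # Singular form: first branch with that name, or None.
            return next((b for b in self.branches if b.name == attr), None)
        if attr.endswith("s") and attr[:-1] in self.TAGS:
            # Plural form: lazy iterator over all branches with that name.
            return (b for b in self.branches if b.name == attr[:-1])
        raise AttributeError(attr)


scream = Node("section", "Chikin Scream", [
    Node("subsection", "Plopping"),
    Node("subsection", "I Scream"),
])

print(scream[1])                                             # I Scream
print(scream.branches[0] is scream[0] is scream.subsection)  # True
print(list(scream.subsections))                              # [Plopping, I Scream]
```

Returning an iterator for the plural form mirrors the examples above, where the result is wrapped in list() before it is displayed.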

You can now print the document tree. Branches other than titles do not yet print cleanly, so the tree below shows titles only:

           ┌Chikin Tales┐
           │            └Chikin Fly
 [document]┤
           │             ┌Plopping
           └Chikin Scream┤
                         │        
                         │        
                         └I Scream
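
If the box-drawing layout is more than you need, an indented listing conveys the same titles-only structure. The render helper below is hypothetical and not part of tex2py. It works on plain (title, branches) tuples, so it runs without the library.

```python
def render(title, branches, indent=0):
    """Return indented lines for a (title, branches) tree of headings."""
    lines = ["  " * indent + title]
    for child_title, child_branches in branches:
        lines.extend(render(child_title, child_branches, indent + 1))
    return lines


# The titles-only tree shown above, written out by hand as nested tuples.
tree = ("[document]", [
    ("Chikin Tales", [("Chikin Fly", [])]),
    ("Chikin Scream", [("Plopping", []), ("I Scream", [])]),
])

print("\n".join(render(*tree)))
```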

Additional Notes

  • Behind the scenes, tex2py uses TexSoup. All tex2py objects have a source attribute containing a TexSoup object.