Python-based command line tool for compiling the proceedings of the Chicago Linguistic Society (CLS)
This is a tool for compiling the proceedings of the Chicago Linguistic Society.

When the individual paper PDFs and other necessary PDFs (front matter, acknowledgments, etc) are in place, this tool compiles the proceedings PDF output, automatically taking care of the following:

  • figuring out the page numbers for individual papers
  • generating the paper headers and adding them to the paper PDFs
  • generating the table of contents
  • concatenating everything to create the final proceedings PDF output

This tool is based essentially on what was used to compile the CLS 48 volume, plus a few upgrades. It has been used to compile the CLS 50 and CLS 51 volumes.

Download is currently available on GitHub:

Two ways of downloading it:


This repository contains the following:

  • the main script, where all the magic happens
  • example: a folder as a sample "working directory" where all necessary input files are found
  • this readme file you are reading
  • the CLS 51 author kit with the CLS stylesheet and templates; included here for reference

System requirements

  1. A Unix-like environment

    This tool is a command line tool out of the box. As of November 2016, all use cases of the tool have been on Unix-like environments only (Linux and Mac OS). Windows is not actively supported. (Cygwin and the like should work in principle, but this has not been tested.)

  2. Python

    Python is required to run the tool. If you are on Linux or Mac OS, Python is readily available. Either Python 2 or 3 works with the tool.

    Throughout this readme document, we use python to generically mean the Python command for your terminal.

  3. The Python package PyPDF2

    We need this package to manipulate PDF files in Python:

    $ python -m pip install PyPDF2

    Administrative privileges (e.g. sudo on Ubuntu) may be required.

    If Python complains that pip is unavailable, you'll need to install pip first. Alternatively, PyPDF2 can be installed from its source on PyPI or GitHub.

  4. The pdflatex program

    The pdflatex command has to be available in your path so that we can call it to compile LaTeX documents. If you are on Linux or Mac and TeX Live is installed, then you probably have pdflatex already.
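    Before a first run, it can help to confirm that the last two requirements are met. The following is a minimal sketch and not part of the tool itself; the function name missing_requirements is hypothetical, and the sketch assumes Python 3 (it relies on shutil.which and importlib.util.find_spec).

```python
import importlib.util
import shutil


def missing_requirements():
    """Return a list of requirement names that could not be found."""
    missing = []
    # PyPDF2 must be importable by the interpreter running this check.
    if importlib.util.find_spec("PyPDF2") is None:
        missing.append("PyPDF2 (Python package)")
    # pdflatex must be discoverable on the PATH.
    if shutil.which("pdflatex") is None:
        missing.append("pdflatex (program)")
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All requirements found.")
```

    Run it with the same python command you intend to use for the tool, since PyPDF2 may be installed for one Python installation but not another.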

Requirements for the input PDF files

All the PDFs needed to compile the CLS proceedings must be in a single working directory; let's call this directory example (as shown in this repository).

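Inside example, all PDFs must be organized in the following way for the tool to work:

    example/
        front-matter/
            <one PDF file, e.g., front-matter.pdf>
        acknowledgments/
            <one PDF file, e.g., acknowledgments.pdf>
        papers-without-headers/
            <all individual paper PDFs without headers or page numbers>
        templates/
            <templates and other files needed, e.g., headers.tex, table-of-contents.tex, blank.pdf>
        organizer.csv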

Four folders are expected inside example; a fifth required item directly under example, organizer.csv, is explained in the next section.

  • front-matter

    Only one PDF file is expected in this folder, e.g., front-matter.pdf (a .tex template is also provided). The tool ignores all non-PDF files in this folder, so you may work within this folder with LaTeX files etc. to generate the required PDF file.

  • acknowledgments

    Only one PDF file is expected in this folder, e.g., acknowledgments.pdf (a .tex template is also provided). The tool ignores all non-PDF files in this folder, so you may work within this folder with LaTeX files etc. to generate the required PDF file.

  • papers-without-headers

    This folder contains all PDFs of the individual papers without headers or page numbers.

  • templates

    This folder contains templates and other files needed. Do NOT change their names, though the contents of headers.tex and table-of-contents.tex can be updated if necessary. blank.pdf is the blank PDF page inserted here and there in the final proceedings PDF so that all items start on the right-hand side in the printed volume.
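The layout described above can be checked before running anything. The following is a minimal sketch of such a check, not the tool's own code: the function names are hypothetical, and it assumes the default folder names and an organizer file named organizer.csv.

```python
import os

EXPECTED_FOLDERS = [
    "front-matter",
    "acknowledgments",
    "papers-without-headers",
    "templates",
]
# Folders in which exactly one PDF file is expected.
SINGLE_PDF_FOLDERS = ["front-matter", "acknowledgments"]


def list_pdfs(folder):
    """Return the PDF filenames in a folder, ignoring all other files."""
    return sorted(n for n in os.listdir(folder) if n.lower().endswith(".pdf"))


def check_working_directory(working_dir):
    """Return a list of problems; an empty list means the layout looks fine."""
    problems = []
    for folder in EXPECTED_FOLDERS:
        path = os.path.join(working_dir, folder)
        if not os.path.isdir(path):
            problems.append("missing folder: " + folder)
            continue
        pdfs = list_pdfs(path)
        if folder in SINGLE_PDF_FOLDERS and len(pdfs) != 1:
            problems.append(
                "%s: expected exactly one PDF, found %d" % (folder, len(pdfs))
            )
        if folder == "papers-without-headers" and not pdfs:
            problems.append("papers-without-headers: no PDFs found")
    if not os.path.isfile(os.path.join(working_dir, "organizer.csv")):
        problems.append("missing file: organizer.csv")
    return problems


if __name__ == "__main__":
    for problem in check_working_directory("example"):
        print(problem)
```

Collecting all problems in one list, rather than stopping at the first one, lets you fix the whole working directory in a single pass.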

The organizer CSV file

An organizer CSV file (e.g., organizer.csv) is required to provide the essential information about authors, paper titles, etc. This CSV file must contain six columns with the following header names:

  1. index

    An index number just for convenience. The first paper is 1 and so forth. Note that the order of the rows in this CSV file (regardless of what the column index says) determines the order in which the papers appear in the proceedings PDF. So make sure all the papers are ordered correctly in the organizer, whether alphabetically by first authors' last names or in whatever order the CLS members would like.

  2. authors

    The cell for each paper in the column authors shows exactly how author names appear in the table of contents. Use LaTeX formatting for non-ASCII characters (accented characters etc).

  3. paper title

    (similar to authors above)

  4. authors in header

    The cell for each paper in the column authors in header shows exactly how author names appear in the paper's header. Use LaTeX formatting for non-ASCII characters (accented characters etc). The cell content will be converted to uppercase in the output proceedings PDF.

    If this cell is empty, then the cell content from authors for the paper in question will be used.

    Note that the cell content of authors in header cannot exceed a certain character length (controlled by the optional argument --maxheaderlength; more on running the tool below), because the header cannot accommodate text so long that it would run over one line or cover up the page number. So for a paper with a long chain of author names, you have to put down something much shorter here (e.g., only the last names).

  5. paper title in header

    (similar to authors in header above)

  6. paper filename

    The cell for each paper in the column paper filename shows the paper's filename (e.g., smith.pdf) as it appears in the folder papers-without-headers.
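The column rules above can be sketched in code. This is an illustration and not the tool's implementation: the function name read_organizer is hypothetical, the sketch assumes Python 3, and it assumes the length cap is counted on the raw cell text (LaTeX markup included) and that an empty paper title in header cell falls back to paper title in the same way authors in header falls back to authors.

```python
import csv
import io

REQUIRED_COLUMNS = [
    "index",
    "authors",
    "paper title",
    "authors in header",
    "paper title in header",
    "paper filename",
]


def read_organizer(csv_file, max_header_length=55):
    """Read the organizer rows in file order and resolve the header cells."""
    reader = csv.DictReader(csv_file)
    fieldnames = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in fieldnames]
    if missing:
        raise ValueError("organizer is missing columns: " + ", ".join(missing))

    papers = []
    for row in reader:
        paper = {c: (row.get(c) or "").strip() for c in REQUIRED_COLUMNS}
        # An empty header cell falls back to the corresponding full cell.
        if not paper["authors in header"]:
            paper["authors in header"] = paper["authors"]
        if not paper["paper title in header"]:
            paper["paper title in header"] = paper["paper title"]
        # Uppercasing is deliberately not done here: a plain str.upper()
        # would corrupt LaTeX commands such as \v{c}.
        for column in ("authors in header", "paper title in header"):
            if len(paper[column]) > max_header_length:
                raise ValueError(
                    "%s: '%s' exceeds %d characters"
                    % (paper["paper filename"], column, max_header_length)
                )
        papers.append(paper)
    return papers


if __name__ == "__main__":
    sample = io.StringIO(
        "index,authors,paper title,authors in header,"
        "paper title in header,paper filename\n"
        "1,Jane Smith,A short title,,,smith.pdf\n"
    )
    for paper in read_organizer(sample):
        print(paper["paper filename"], "|", paper["authors in header"])
```

The returned list keeps the row order of the CSV file, which is what determines the order of the papers; the index column is read but never used for sorting.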

How to run

If you have a working directory like example with all necessary files properly organized according to the guidelines here, then run the following from the directory where the tool's script is located:

$ python <path-to-the-script> --directory=<relative-path-to-your-working-directory>

If you don't provide --directory=<relative-path-to-your-working-directory> (i.e., if you run the script without any arguments), the tool assumes that the example folder is in the current directory and that it is your working directory.

The tool allows various optional arguments for changing file/folder names etc. Please run python <path-to-the-script> -h for details. Among the array of optional parameters, you may be interested in the following:

  • --maxheaderlength

    The maximum length (by number of characters) of the author or paper title headers in the paper PDFs (default: 55). This cap ensures that the header does not go over one line or cover up the page number in the header. To change the value to, say, 60, do something like python <path-to-the-script> --maxheaderlength=60.

  • --startpagenumber

    The starting page number of the first paper by order in the proceedings volume (default: 1).

Multiple optional parameters are possible, in the form of python <path-to-the-script> --<parametername1>=<parametervalue1> --<parametername2>=<parametervalue2>.
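To illustrate how --startpagenumber interacts with the blank pages that keep every item on a right-hand page, here is a minimal sketch of the page-number bookkeeping. It is not the tool's own code: the function name assign_start_pages is hypothetical, and it assumes that right-hand pages are the odd-numbered ones and that the starting page number is odd.

```python
def assign_start_pages(page_counts, start_page_number=1):
    """Compute each paper's first page number, in proceedings order.

    page_counts: number of pages of each paper, in proceedings order.
    Returns (start_pages, blanks_after), where blanks_after[i] is the
    number of blank pages (0 or 1) inserted after paper i so that the
    next item starts on an odd-numbered (right-hand) page.
    """
    start_pages = []
    blanks_after = []
    current = start_page_number
    for count in page_counts:
        start_pages.append(current)
        next_page = current + count
        # If the page following this paper is even, pad with one blank page.
        blank = 1 if next_page % 2 == 0 else 0
        blanks_after.append(blank)
        current = next_page + blank
    return start_pages, blanks_after


if __name__ == "__main__":
    starts, blanks = assign_start_pages([10, 15, 12], start_page_number=1)
    print(starts)  # [1, 11, 27]
    print(blanks)  # [0, 1, 0]
```

In this example the second paper has an odd number of pages (15), so one blank page follows it and the third paper starts on page 27 rather than 26.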


If you run the tool and all goes well, the final proceedings PDF output should be sitting right inside your working directory (hurray!). In addition, all intermediate files are kept for reference. Inside the working directory, you should see the new folders table-of-contents, headers, and papers-with-headers (all individual paper PDFs nicely typeset with headers and page numbers here!).

Note that if you are running multiple times, all already-existing contents inside the folders table-of-contents, headers, and papers-with-headers will be removed at each run to ensure clean output files.

Upon completion of the final PDF compilation, three log files are generated at the working directory: master.log, pdflatex.log, and directory.log.

Technical support etc.

CLS officers are welcome to contact Jackson Lee with any questions regarding this tool. If you run into any issues and would like Jackson's help with troubleshooting, the following will help him figure out how to help you:

  • Tell him what error messages (if any) appear on the terminal.
  • Send him the three log files (master.log, pdflatex.log, and directory.log).

To take advantage of the GitHub infrastructure: code contributions through pull requests are more than welcome! Questions and bug reports? Please submit a ticket on GitHub.

Dev notes

The overarching strategies of the tool:

  • Check if everything needed is in place before any PDF manipulation is done
  • Issue an error (and exit) as soon as one is detected

Everything in the proceedings volume that comes after the table of contents is treated as "papers". This means that, for instance, if something like a prompt page to introduce the main session or parasession papers is desired, it should be treated as a "paper" and included in the organizer CSV file (but we probably don't want headers and page numbers for these -- would need some way to handle this).