mjambon/vanity

Vanity

Introduction

What is intuition? What are emotions? Most people are confused by those terms, and typically treat them as taboos. Not only won't they try to seriously define them, but they will also treat attempts to do so as some kind of heresy.

However, some of us actively work toward the mechanization of cognition. We assign specific, non-mystical meanings to these terms. At least to some of them. It's often better to reuse a term like "imagination" and give it a precise sense, rather than making up a new word that doesn't evoke anything.

Disambiguating and communicating the precise meaning of technical terms is the goal of this project. It offers three things:

  1. Recommendations for defining technical terms, and in particular the avoidance of mutually-recursive definitions.
  2. A machine-readable, human-readable format for structuring glossaries.
  3. A command-line program called vanity for formatting a glossary into an HTML document or a graph of terms.

Installation

Download a statically-linked executable for your platform.

If your platform is not listed or if you want to try a development version, you'll have to build it from source. We have some instructions in DEV.md. After installing the prerequisites, you can build and install vanity for your Unixy platform using make && make install.

Example

The following glossary was formatted by vanity from examples/vanity.yml and inserted into this readme:

term: a word or sequence of words whose meaning depends on the context.

definition: a textual description of the meaning or meanings of a term.

technical term: a term with a fixed meaning within a field of study. Such terms may be formatted differently than ordinary terms to indicate that they should be interpreted as a technical term.

dictionary: a set of terms and their definitions.

glossary: a dictionary of technical terms.

graph: a graph in the sense of graph theory, that is a set of nodes and a set of edges connecting nodes.

DAG: DAG is the usual abbreviation for a directed acyclic graph in graph theory. Each edge is an arrow connecting two nodes. In a DAG, following the edges starting from any node guarantees that eventually we'll reach a node that has no outgoing edges.

constructive glossary: a glossary which forms a DAG of technical terms. For each technical term, there is a unique node A. For each reference to a technical term B in the definition of A, there is an edge AB in the DAG.

The graph was exported by vanity to the dot format, and then dot turned it into this image:

[Image: DAG of the glossary terms]

The source yaml for this glossary is:

# Glossary of terms used by the vanity project.
---
- term: term
  def: >
    a word or sequence of words whose meaning depends on the context.
  syn:
    - terms
- term: definition
  def: >
    a textual description of the meaning or meanings of a [term].
  syn:
    - definitions
- term: technical term
  def: >
    a [term] with a fixed meaning within a field of study. Such [terms] may be
    formatted differently than ordinary [terms] to indicate that they should
    be interpreted as a [technical term].
  syn:
    - technical terms
- term: dictionary
  def: >
    a set of [terms] and their [definitions].
- term: glossary
  def: >
    a [dictionary] of [technical terms].
  syn:
    - glossaries
- term: graph
  def: >
    a graph in the sense of graph theory, that is a set of nodes and
    a set of edges connecting nodes.
- term: DAG
  def: >
    [DAG] is the usual abbreviation for a directed acyclic [graph] in graph
    theory. Each edge is an arrow connecting two nodes. In a [DAG], following
    the edges starting from any node guarantees that eventually we'll reach
    a node that has no outgoing edges.
- term: constructive glossary
  def: >
    a [glossary] which forms a [DAG] of [technical terms].
    For each [technical term], there is a unique node A. For each
    reference to a [technical term] B in the [definition] of A,
    there is an edge AB in the [DAG].

The commands for producing these results can be found in examples/Makefile.

Documentation

Once installed, check out the output of vanity --help. The following output formats are supported:

  • HTML snippet or standalone page
  • graph in the dot format understood by Graphviz
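To give an idea of what the dot output amounts to, here is a minimal sketch in Go. It is not vanity's actual code: the Definition type, the ToDot function, and the choice to skip self-references and duplicate edges are all assumptions made for illustration. It emits one node per term and one edge per link found in a definition.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Definition mirrors one glossary entry (term, def, syn).
// The type and its field names are assumptions made for this sketch.
type Definition struct {
	Term string
	Def  string
	Syn  []string
}

// linkPattern matches a bracketed link such as [potato].
var linkPattern = regexp.MustCompile(`\[([^\[\]]+)\]`)

// ToDot renders the glossary as a Graphviz digraph: one node per term and
// one edge A -> B for each term B referenced in the definition of A.
func ToDot(defs []Definition) string {
	// Map every term and synonym to its standard term.
	canonical := make(map[string]string)
	for _, d := range defs {
		canonical[d.Term] = d.Term
		for _, s := range d.Syn {
			canonical[s] = d.Term
		}
	}
	var b strings.Builder
	b.WriteString("digraph glossary {\n")
	for _, d := range defs {
		// %q produces a double-quoted string, which dot accepts as an ID.
		fmt.Fprintf(&b, "  %q;\n", d.Term)
		seen := make(map[string]bool)
		for _, m := range linkPattern.FindAllStringSubmatch(d.Def, -1) {
			target, ok := canonical[m[1]]
			// Skip unknown links, self-references, and duplicate edges.
			if !ok || target == d.Term || seen[target] {
				continue
			}
			seen[target] = true
			fmt.Fprintf(&b, "  %q -> %q;\n", d.Term, target)
		}
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	defs := []Definition{
		{Term: "potato", Def: "the edible tuber from the potato plant", Syn: []string{"potatoes"}},
		{Term: "French fries", Def: "deep-fried [potatoes]"},
	}
	fmt.Print(ToDot(defs))
}
```

The program prints a digraph with the nodes "potato" and "French fries" and the single edge "French fries" -> "potato". Piping such output into dot -Tpng is the usual way to turn it into an image.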

Input format reference

A valid input document is a yaml document consisting of an ordered list (yaml array) of definitions. Each definition has two mandatory fields, term and def, and one optional field, syn.

  • term: string that represents the standard form of the term being defined.
  • def: string that holds the definition for the term. Links to other terms are placed within square brackets, as in "The sky is [cloudy] today." Only links to previous definitions or to the current definition are permitted. The text of the link must be a term from a term field or one of its synonyms from a syn field.
  • syn: array of strings that are considered synonyms of the term being defined, in the sense that any reference to a synonym will link to the standard term. This can be used to hold different variations of a word, such as plural forms, gendered forms, plain synonyms, conjugated forms of verbs, etc. (This will become annoying for some languages other than English.)

The definitions must be sorted such that every link refers to a term that was defined earlier or to the term currently being defined. For example, the following is valid input:

- term: potato
  def: the edible tuber from the potato plant
- term: French fries
  def: deep-fried [potato] chunks

We can add potatoes as a synonym of potato and link to potatoes instead of potato:

- term: potato
  def: the edible tuber from the potato plant
  syn:
    - potatoes
- term: French fries
  def: deep-fried [potatoes]

Order matters. The following is illegal:

- term: French fries
  def: deep-fried [potato] chunks
- term: potato
  def: the edible tuber from the potato plant

In that case we get an error message:

$ vanity < glossary.yml > glossary.html
error: definition for term 'French fries' uses undefined term: 'potato'.
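The ordering rule can be checked in a single pass over the list. The sketch below is an illustration in Go, not vanity's actual code: the Definition type and the Validate function are hypothetical names, and the error text merely imitates the message shown above. Each definition registers its own term and synonyms before its links are checked, which is what allows links to the current definition.

```go
package main

import (
	"fmt"
	"regexp"
)

// Definition mirrors one glossary entry (term, def, syn).
// The type and its field names are assumptions made for this sketch.
type Definition struct {
	Term string
	Def  string
	Syn  []string
}

// linkPattern matches a bracketed link such as [potato].
var linkPattern = regexp.MustCompile(`\[([^\[\]]+)\]`)

// Validate returns an error for the first link that refers to a term
// that is neither defined earlier nor being defined right now.
func Validate(defs []Definition) error {
	known := make(map[string]bool)
	for _, d := range defs {
		// Register the current term first so that self-links are accepted.
		known[d.Term] = true
		for _, s := range d.Syn {
			known[s] = true
		}
		for _, m := range linkPattern.FindAllStringSubmatch(d.Def, -1) {
			if !known[m[1]] {
				return fmt.Errorf("definition for term '%s' uses undefined term: '%s'", d.Term, m[1])
			}
		}
	}
	return nil
}

func main() {
	bad := []Definition{
		{Term: "French fries", Def: "deep-fried [potato] chunks"},
		{Term: "potato", Def: "the edible tuber from the potato plant"},
	}
	if err := Validate(bad); err != nil {
		fmt.Println("error:", err)
	}
}
```

Because a link may only point backward or to itself, any glossary accepted by this check has no cycles between distinct terms.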

Implementation

vanity is implemented as a command-line program that reads a dictionary in source form, checks its validity, and produces a readable document.

The input is a list of term definitions. The yaml syntax was chosen as it accommodates text better than json and is more readable than XML. Structured data such as lists of synonyms can easily be added without extending the syntax. The only originality is in the markup language used in the body of the definitions, which uses its own conventions to link terms to their definition.
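As an illustration of the linking convention, the following Go sketch turns bracketed links into HTML anchors. It is a guess at the general approach rather than vanity's actual code: RenderDef, anchorID, and the lowercase-with-dashes fragment scheme are hypothetical, and the real output of vanity may differ.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// linkPattern matches a bracketed link such as [potato].
var linkPattern = regexp.MustCompile(`\[([^\[\]]+)\]`)

// anchorID turns a standard term into a fragment identifier,
// e.g. "French fries" becomes "french-fries". Hypothetical scheme.
func anchorID(term string) string {
	return strings.ReplaceAll(strings.ToLower(term), " ", "-")
}

// RenderDef converts the bracketed links of a definition into HTML anchors.
// canonical maps each term or synonym to its standard term.
func RenderDef(def string, canonical map[string]string) string {
	var b strings.Builder
	last := 0
	for _, loc := range linkPattern.FindAllStringSubmatchIndex(def, -1) {
		// loc[0]:loc[1] spans "[text]"; loc[2]:loc[3] spans "text".
		text := def[loc[2]:loc[3]]
		b.WriteString(html.EscapeString(def[last:loc[0]]))
		if target, ok := canonical[text]; ok {
			fmt.Fprintf(&b, `<a href="#%s">%s</a>`,
				html.EscapeString(anchorID(target)), html.EscapeString(text))
		} else {
			// Unknown link text is left as written.
			b.WriteString(html.EscapeString(def[loc[0]:loc[1]]))
		}
		last = loc[1]
	}
	b.WriteString(html.EscapeString(def[last:]))
	return b.String()
}

func main() {
	canonical := map[string]string{"potato": "potato", "potatoes": "potato"}
	fmt.Println(RenderDef("deep-fried [potatoes]", canonical))
}
```

The link keeps the text the author wrote (here the plural synonym) while pointing at the standard term, which matches the role of the syn field described in the input format reference.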

The Go language was chosen for this implementation as it's relatively friendly to external contributors, and it was a good opportunity for the author to learn it. The ability to distribute binaries for various platforms with no runtime dependencies is also a big plus.

Ideas for future contributions

Public awareness:

  • Write an introductory article explaining why and when this thing can be useful.
  • Use the tool and publish reports on its usefulness and lessons learned.

Maintenance and distribution:

  • Add automatic testing using one of Travis, CircleCI, GitHub Actions, etc.
  • Add contribution guidelines (highly recommended to do before accepting contributions).

User-facing features:

  • Document the input format.
  • Produce a graph even if it has cycles, as an aid to see what's going on.
  • Use topological sorting to implement some of the following features:
    • Rearrange the input document in an order compatible with the dependencies. This is a conversion from yaml to yaml.
    • Automatically sort the input document topologically so that the author doesn't have to. This is only for checking purposes. This doesn't produce new input or different output.
    • Sort the definitions in the output document in dependency order or reverse dependency order, depending on user preference.
  • Add an option to sort the terms alphabetically.
  • Add support for multiple senses via some dedicated syntax. It could be something like something_2 where the term is identified by the full string something_2 but rendered as just something or something (2), and links to the correct definition.
  • Offer an out-of-the-box option for showing a definition preview on hover, or on single tap on mobile. This would work like the Wikipedia mobile app or Wikiwand.
  • Export to PDF or whichever format is in demand. Pandoc is an excellent tool for this. Part of the work would consist in making the original output of vanity fully understood by pandoc. Perhaps the best format for this isn't HTML but some other language better suited for pandoc input.
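For the topological sorting ideas in the list above, a possible starting point is Kahn's algorithm. The Go sketch below is only a suggestion for a future contributor, not existing vanity code: the Definition type and SortByDependency are hypothetical names. It reorders definitions so that each one comes after the terms it links to, ignores self-links, and reports an error when distinct terms form a cycle.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Definition mirrors one glossary entry (term, def, syn).
// The type and its field names are assumptions made for this sketch.
type Definition struct {
	Term string
	Def  string
	Syn  []string
}

// linkPattern matches a bracketed link such as [potato].
var linkPattern = regexp.MustCompile(`\[([^\[\]]+)\]`)

// SortByDependency returns the definitions reordered so that each one
// appears after every other definition it links to.
func SortByDependency(defs []Definition) ([]Definition, error) {
	// index maps every term and synonym to the position of its definition.
	index := make(map[string]int)
	for i, d := range defs {
		index[d.Term] = i
		for _, s := range d.Syn {
			index[s] = i
		}
	}
	// dependents[j] lists the definitions that link to j;
	// pending[i] counts the dependencies of i not yet emitted.
	dependents := make([][]int, len(defs))
	pending := make([]int, len(defs))
	for i, d := range defs {
		seen := make(map[int]bool)
		for _, m := range linkPattern.FindAllStringSubmatch(d.Def, -1) {
			j, ok := index[m[1]]
			if !ok {
				return nil, fmt.Errorf("definition for term '%s' uses undefined term: '%s'", d.Term, m[1])
			}
			if j == i || seen[j] {
				continue // ignore self-links and repeated links
			}
			seen[j] = true
			dependents[j] = append(dependents[j], i)
			pending[i]++
		}
	}
	// Kahn's algorithm: repeatedly emit a definition with no pending dependency.
	var queue []int
	for i := range defs {
		if pending[i] == 0 {
			queue = append(queue, i)
		}
	}
	sorted := make([]Definition, 0, len(defs))
	for len(queue) > 0 {
		i := queue[0]
		queue = queue[1:]
		sorted = append(sorted, defs[i])
		for _, k := range dependents[i] {
			pending[k]--
			if pending[k] == 0 {
				queue = append(queue, k)
			}
		}
	}
	if len(sorted) != len(defs) {
		return nil, errors.New("glossary contains a cycle of definitions")
	}
	return sorted, nil
}

func main() {
	defs := []Definition{
		{Term: "French fries", Def: "deep-fried [potato] chunks"},
		{Term: "potato", Def: "the edible tuber from the potato plant"},
	}
	sorted, err := SortByDependency(defs)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, d := range sorted {
		fmt.Println(d.Term)
	}
}
```

Given the illegal ordering from the input format reference, the program prints potato first and French fries second. Reversing the resulting slice would give the reverse dependency order mentioned in the list.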

Project status

I won't develop the vanity tool further because it works well enough for me, the original author (@mjambon). The project is up for adoption if anyone is interested in adding features, reviewing suggestions, pull requests, etc.