Open Mensa Feed v2 Generators
2015-2020, Georg Sauthoff mail@georg.so
This repository contains a few programs for generating menus in the Open Mensa Feed v2 format:
- fra2openmensa.py - Frankfurt University Mensa page parser, also supports other Mensas of the operator
- unibi2openmensa.py - Bielefeld University Mensa page parser, also supports other Mensas of the operator
- fhrus2openmena.cc - FH-Rüsselsheim Mensa page parser, also supports several other Mensas of the Studentenwerk Frankfurt (outdated)
- unibi2openmensa.cc - Bielefeld University Mensa page parser (outdated)
The first-generation generators (the outdated .cc programs) first clean up the HTML pages into well-formed and conforming XHTML via HTML Tidy and then convert the result into the XML feed.
They are written in C++11 and rely heavily on XPath expressions.
Python replaces shell scripting for the update and test scripts because of its library support.
The second generation is written in Python.
The main motivation for choosing Python over C++ is the html5lib Python package, which does a very good job of normalizing real-world HTML into an XML tree. It certainly does a better job than libxml2 (via its HTML API) and HTML Tidy. Also, in contrast to HTML Tidy, it is actively maintained, and as a library it is easier to integrate than an external program.
Besides that, the rich Python standard library leaves little to be desired: high-level string processing, XML tree construction, and XPath are all available without extra dependencies.
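
To illustrate the standard-library point, here is a minimal sketch that extracts meals from an already normalized page with XPath and builds a small feed tree. It is not taken from the generators in this repository. The input markup, the class names, the `Main` category, and the helpers `parse_price` and `page_to_feed` are all hypothetical, and the feed element names and namespace reflect my understanding of the Open Mensa Feed v2 format, so check them against the official schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already well-formed menu page, standing in for what an
# HTML normalizer would hand over. Markup and class names are made up.
PAGE = """<html><body>
<div class="day" data-date="2020-03-02">
  <div class="meal"><span class="name">Lentil soup</span><span class="price">1,80 €</span></div>
  <div class="meal"><span class="name">Pasta</span><span class="price">2,50 €</span></div>
</div>
</body></html>"""

# Assumed namespace of the Open Mensa Feed v2 format.
NS = 'http://openmensa.org/open-mensa-v2'


def parse_price(text):
    """Turn a German-style price such as '1,80 €' into '1.80'."""
    return text.replace('€', '').strip().replace(',', '.')


def page_to_feed(page):
    """Build a feed tree from the hypothetical page layout above."""
    root = ET.fromstring(page)
    # Writing xmlns as a plain attribute keeps the in-memory tags
    # unqualified while the serialized output still declares the namespace.
    feed = ET.Element('openmensa', {'version': '2.1', 'xmlns': NS})
    canteen = ET.SubElement(feed, 'canteen')
    # ElementTree supports a subset of XPath, enough for predicates like these.
    for day in root.findall(".//div[@class='day']"):
        day_elem = ET.SubElement(canteen, 'day', date=day.get('data-date'))
        category = ET.SubElement(day_elem, 'category', name='Main')
        for meal in day.findall("./div[@class='meal']"):
            meal_elem = ET.SubElement(category, 'meal')
            name = meal.findtext("./span[@class='name']")
            ET.SubElement(meal_elem, 'name').text = name
            price = ET.SubElement(meal_elem, 'price', role='student')
            price.text = parse_price(meal.findtext("./span[@class='price']"))
    return feed


if __name__ == '__main__':
    print(ET.tostring(page_to_feed(PAGE), encoding='unicode'))
```

Real pages are rarely well-formed, which is why the second-generation generators normalize them with html5lib first. The sketch skips that step and starts from clean markup to stay dependency-free.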