Encode class names

Kenneth Hoste edited this page Jun 11, 2014 · 2 revisions

Since package names can be arbitrary and may begin with numerals (7zip) or contain characters that are not suitable in class names (c++, C#), a function is needed to remap those names to "safe" alternatives, in a way that maintains the following properties:

  • readability: the new name should be as readable as possible
  • reversibility & mapping: the new name should reflect the original name well
  • standardization: it should follow established practice, if any
  • extendibility: the logic should be extendable for future needs
  • nameclash-free: class names are prefixed with "EB_" by default

The function encode_class_name returns the encoded version of a class name; it is in turn based on an EB_ prefix and the encode_string function.

For the programmer, this means that if, for example, you are extending something called C++, your class name is likely EB_C_plus__plus_.


package name    class name
7zip            EB_7zip
C#              EB_C_hash_
Charm++         EB_Charm_plus__plus_
r               EB_r
map\reduce      EB_map_backslash_reduce
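
The mapping in the table above can be reproduced with a few lines of Python. This is only a minimal sketch and not the EasyBuild implementation. The names CHARMAP, encode_string_sketch and encode_class_name_sketch are hypothetical, and the character table lists only the characters that appear on this page, on the assumption that each one is replaced by a lowercase word wrapped in underscores.

```python
# Hypothetical subset of the character map: only characters shown on this
# page are listed. The table used by EasyBuild itself may contain more.
CHARMAP = {
    '_': '_underscore_',
    '+': '_plus_',
    '#': '_hash_',
    '-': '_minus_',
    '$': '_dollar_',
    '\\': '_backslash_',
}

# Prefix that keeps generated class names from clashing with other names.
CLASS_PREFIX = 'EB_'


def encode_string_sketch(name):
    """Replace every mapped character by its word token, keep the rest."""
    return ''.join(CHARMAP.get(char, char) for char in name)


def encode_class_name_sketch(name):
    """Prefix the encoded package name with 'EB_'."""
    return CLASS_PREFIX + encode_string_sketch(name)


if __name__ == '__main__':
    for package in ['7zip', 'C#', 'Charm++', 'r', 'map\\reduce']:
        print('%s -> %s' % (package, encode_class_name_sketch(package)))
```

Encoding one character at a time keeps the result predictable: letters and digits pass through unchanged, so names such as 7zip and r only gain the prefix.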

The encoding function can handle even more complicated package names, like:

  • name: 0_foo+0x0x#-$__
  • becomes: 0_underscore_foo_plus_0x0x_hash__minus__dollar__underscore__underscore_
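
The reversibility property can be illustrated by mapping the tokens back to their characters. The following is a hedged sketch, not EasyBuild code. TOKENS, decode_string_sketch and decode_class_name_sketch are hypothetical names, and the token table is limited to the words that occur in the examples on this page. It relies on the observation that every underscore in the original name is itself encoded, so a known word wrapped in underscores can only be a token.

```python
import re

# Hypothetical token table, limited to the words used on this page.
TOKENS = {
    'underscore': '_',
    'plus': '+',
    'hash': '#',
    'minus': '-',
    'dollar': '$',
    'backslash': '\\',
}

# Matches one known token wrapped in underscores, e.g. '_plus_'.
TOKEN_RE = re.compile('_(%s)_' % '|'.join(TOKENS))

CLASS_PREFIX = 'EB_'


def decode_string_sketch(encoded):
    """Replace each known token by the character it stands for."""
    return TOKEN_RE.sub(lambda match: TOKENS[match.group(1)], encoded)


def decode_class_name_sketch(class_name):
    """Drop the 'EB_' prefix (if present), then decode the remainder."""
    if class_name.startswith(CLASS_PREFIX):
        class_name = class_name[len(CLASS_PREFIX):]
    return decode_string_sketch(class_name)


if __name__ == '__main__':
    print(decode_class_name_sketch('EB_Charm_plus__plus_'))
```

The prefix is removed before decoding so that its trailing underscore cannot be mistaken for the start of a token.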

It has been inspired by the concepts seen at: (but in lowercase style)

Determining the class name for a given package

>>> from easybuild.tools.filetools import encode_class_name
>>> encode_class_name('GAMESS-US')
'EB_GAMESS_minus_US'