amirroth edited this page Sep 22, 2021 · 8 revisions

Casing

C++ allows for arbitrary-length identifier names, so there's no reason to be terse when naming variables. I'm looking at you, EDIRSK, EINTSU, and the rest. You can use multiple full words in a name. What a concept! There are two common styles for separating words in a name:

  • CamelCase
  • snake_case

The C/C++ libraries use snake_case (think size_type or std::unordered_map) which has the advantage that it can also work with spell checkers. CamelCase is more common with Java. I personally prefer snake_case, but all of those underscores can really do a number on your right pinky! I have seen both in EnergyPlus.

Distinguishing Types from Variables

One aspect of a naming convention is distinguishing types from variables. There are a few options here:

  • One somewhat common convention is for types to start with capital letters (e.g., MyClass) and for variable (and function) names to start with lower-case letters (e.g., myVariable).
    • [SM] Do we want to specify a preference for my_method over myMethod? The former is probably more common now. Qt and some libs use the latter. An exclusion when the case is vital to the meaning should be allowed.
  • Another convention I have seen is for types to start or end with a certain character group. For instance, all struct/class types either start with C_ or end with _c and all enumerated types start with T_ or end with _t (e.g., C_MyClass or MyClass_c).
  • Another possibility is to use the keywords struct, class, and enum whenever declaring a variable of the appropriate type (e.g., class MyClass object). C++ does not require this, but it can help with "self documentation". And sometimes syntax coloring too!

EnergyPlus does not currently follow any standard practice here, not even the lower-case/upper-case split.
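
Here is a minimal sketch of the three options above side by side. All of the names (ZoneReport, Surface_c, SurfaceKind_t, totalArea, makeZoneReport) are made up for illustration and are not taken from the EnergyPlus code base.

```cpp
#include <string>

// Option 1: types start with a capital letter, variables and functions with lower case.
struct ZoneReport
{
    std::string name;
    int surfaceCount = 0;
};

// Option 2: a suffix marks the kind of type (_c for struct/class, _t for enum).
struct Surface_c
{
    double area = 0.0;
};
enum class SurfaceKind_t { Wall, Roof, Floor };

// Option 3: repeat the struct keyword wherever a variable of that type is declared.
double totalArea(struct Surface_c const &first, struct Surface_c const &second)
{
    return first.area + second.area;
}

ZoneReport makeZoneReport(std::string const &zoneName, int surfaceCount)
{
    ZoneReport zoneReport; // type and variable differ only in the leading capital
    zoneReport.name = zoneName;
    zoneReport.surfaceCount = surfaceCount;
    return zoneReport;
}
```

Note that with option 1 the variable zoneReport and the type ZoneReport are one keystroke apart, which is exactly the situation the suffix and keyword options try to avoid.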

Distinguishing Variables of Different Scopes

Another aspect of a naming convention is distinguishing variables of different scopes, i.e., local, global, object, and (since C++11, which introduced constexpr) compile-time constants. EnergyPlus doesn't actually use global variables anymore (see this page for an explanation) so this is really just for local and object variables and compile-time constants. Here are some possibilities:

  • One way to distinguish object variable names is to prepend references to them with this-> if the object name is not otherwise explicit. C++ does not require this because it has a precedence/shadowing rule for variable resolution and so it assumes (this->) if the variable is not declared locally, but this is a possibility.
  • Another mechanism for distinguishing object variable names is to prepend them with m_ (which is short for member).
    • [SM] Name decorating by storage type is needed to distinguish the data from its accessors but putting the decoration at the front of the name obscures readability. Suggest x for public data, x_ for private data, x() for accessors for readability.
  • One convention for distinguishing constants is by making them all-caps, e.g., constexpr double PI = 3.14159265358979323.
    • [SM] That is more common for macros. It could be informative to readers. On the other hand, sometimes the constness is an implementation detail that shouldn't leak into the point of use. Also, it emphasizes the constants visually when they may be relatively unimportant: foo( arg, ROUTINENAME ) is not lovely.
    • [AR] This has been my experience as well.
  • Local variables (including function parameters, which are technically local to the function) are typically not decorated in any special way, and so are identified by this lack of decoration.

EnergyPlus does not follow any consistent practice here either!
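
To make the options above concrete, here is a small sketch that combines the this-> prefix, the trailing-underscore style suggested by [SM], and an all-caps constexpr constant. The class Duct_c and its members are hypothetical, and mixing all three in one class is only for demonstration; a real code base would pick one.

```cpp
// Compile-time constant. All-caps is one option, though [SM] notes it is more common for macros.
constexpr double PI = 3.14159265358979323;

class Duct_c
{
public:
    explicit Duct_c(double diameter) : diameter_(diameter)
    {
    }

    // The accessor gets the undecorated name; the trailing underscore marks the private data.
    double diameter() const
    {
        return diameter_;
    }

    double area() const
    {
        // this-> is optional here; it only makes the member access explicit.
        double radius = this->diameter_ / 2.0; // undecorated name, so a local variable
        return PI * radius * radius;
    }

private:
    double diameter_; // the m_ style would spell this m_diameter
};
```

At the call site this reads as duct.diameter() and duct.area(), so the decoration stays inside the class.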

Hungarian Notation

Back in the day (it was a Wednesday) there was this practice called Hungarian Notation (named for the Hungarian origin of its inventor, Microsoft Word developer and Chief Architect Charles Simonyi) that prepended every variable with a signature corresponding to its type, e.g., szName (sz is the signature for zero/null-terminated string) and arriZones (arri is the signature for array of int). I have seen a bit of this in EnergyPlus, specifically type names or type values starting with i (indicating int), but it is not pervasive. I don't believe Hungarian is used much these days and I don't think there is much value in using it in EnergyPlus.

Array (or Collection) Variables

A pervasive pattern in EnergyPlus is to give arrays and other collections singular names, e.g., EPVector<DataHeatBalance::ZoneData> Zone or EPVector<DataSurfaces::SurfaceData> Surface. A better practice is to use plural names or names that at least suggest that this is a collection of more than one object, e.g., EPVector<DataHeatBalance::ZoneData> zones or EPVector<DataSurfaces::SurfaceData> surfaceVec.

Actually there are a few other naming problems with these declarations. First, DataHeatBalance and DataSurfaces are not good namespace names; HBal and Surf are better (see this page for an explanation). Second, ZoneData and SurfaceData are not good type names. So, what you're telling me is that these classes contain ... data? Get out of here! Zone_c and Surface_c or even just Zone and Surface (with the struct modifier) are better: EPVector<HBal::Zone_c> zones; or EPVector<struct Surf::Surface> surfaceVec;. Isn't that better?
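
Here is a rough sketch of how the plural name pays off at the point of use. It assumes std::vector as a stand-in for EPVector so that the snippet compiles on its own, and the Zone fields and the totalFloorArea function are invented for the example.

```cpp
#include <string>
#include <vector>

namespace HBal {
struct Zone
{
    std::string name;
    double floorArea = 0.0;
};
} // namespace HBal

// The plural name signals a collection, which frees the singular name for one element.
double totalFloorArea(std::vector<HBal::Zone> const &zones)
{
    double total = 0.0;
    for (HBal::Zone const &zone : zones) {
        total += zone.floorArea;
    }
    return total;
}
```

With a collection named Zone, the loop variable would need some other name, and the reader would have to work out which of the two is the single object.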

Loop Induction Variables

The canonical loop induction variable is i. If you have nested loops, the convention is to use j, k, etc. for the second-level loop, third-level loop, and so on. i, j, and k provide some level of documentation by at least telling you what loop nesting level you are in, but not much else. It's better to give the induction variable a logically meaningful name like surfNum or spaceIdx. And no, Loop is not a logically meaningful name. So, what you're telling me is ... this is a loop? Stop it! Where?
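
As an illustration, here is a nested loop written with meaningful induction variable names. The function sumSurfaceAreas and the nested-vector layout are assumptions made for this sketch, not EnergyPlus data structures.

```cpp
#include <cstddef>
#include <vector>

// surfaceAreas[zoneIdx][surfIdx] is a hypothetical per-zone list of surface areas.
double sumSurfaceAreas(std::vector<std::vector<double>> const &surfaceAreas)
{
    double total = 0.0;
    for (std::size_t zoneIdx = 0; zoneIdx < surfaceAreas.size(); ++zoneIdx) {
        for (std::size_t surfIdx = 0; surfIdx < surfaceAreas[zoneIdx].size(); ++surfIdx) {
            // Swapping the two subscripts by accident would look wrong here; with i and j it would not.
            total += surfaceAreas[zoneIdx][surfIdx];
        }
    }
    return total;
}
```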

The whole surfNum thing brings up the question of what is the right convention for the array index of object X as well as for the total number of X in the array. EnergyPlus actually uses a mix of styles:

  • SurfNum and TotSurfaces. Hmmmmm.
  • ZoneNum and NumOfZones. The use of Of in variable names really irks me especially if other parts of the variable name are contracted. Feels very "penny-wise, pound-foolish" if you know what I mean.
  • spaceIndex and numSpaces. This one is my favorite although I would have preferred idxSpace.
  • There is also frequent use of Ptr as in SchedPtr which is strange because we don't really reference objects by pointer in EnergyPlus.

I think I have made my preference clear here.

  • Xs for an array of X.
  • idxX for an index into Xs.
  • numXs for the number of X in Xs.
  • XIndices for an array of indices into Xs.
  • idxXIndices for an index into an array of indices.
  • And so on.

YMMV.
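
For what it's worth, here is how that scheme might look with X = surface. This is only a sketch: it assumes a 0-based std::vector rather than an EnergyPlus array type, and the Surface struct and selectedArea function are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct Surface
{
    double area = 0.0;
};

// Sums the areas of the surfaces picked out by surfaceIndices (an array of indices into surfaces).
double selectedArea(std::vector<Surface> const &surfaces, std::vector<std::size_t> const &surfaceIndices)
{
    std::size_t const numSurfaces = surfaces.size();
    double total = 0.0;
    for (std::size_t idxSurfaceIndices = 0; idxSurfaceIndices < surfaceIndices.size(); ++idxSurfaceIndices) {
        std::size_t const idxSurface = surfaceIndices[idxSurfaceIndices];
        if (idxSurface < numSurfaces) { // ignore indices that fall outside surfaces
            total += surfaces[idxSurface].area;
        }
    }
    return total;
}
```

The names get long, but each one says which array it indexes, which is the point.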
