Best practices for working with classes

Aaron Meurer edited this page Aug 16, 2017 · 7 revisions

Here are some best practices for working with classes. The rules here should be followed in the SymPy codebase, but they also apply to Python classes in general.

  • If your code will be run in Python 2, every class should subclass from object.

    That is, avoid class A: without declaring a superclass. If your code will not run in Python 2 (that is, it is Python 3-only), it is not necessary to subclass from object, as this is done automatically, and in fact it should be avoided. SymPy supports Python 2 (but note that almost every SymPy class will actually subclass from a SymPy superclass such as Basic, Expr, or Function).

    Why? In Python 2, a class declared without a superclass is an "old-style" class. A class that subclasses from object is a new-style class. Old-style classes do not support many of the features of new-style classes, such as descriptors and __slots__. They also behave differently when used with multiple inheritance. Furthermore, since old-style classes have been removed in Python 3, code that runs in both Python 2 and Python 3 may behave differently if the class is old-style in Python 2 and new-style in Python 3.
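
The Python 3 side of this rule can be checked directly. The following is a minimal sketch, assuming a Python 3 interpreter; the class names Implicit and Explicit are hypothetical and exist only for illustration. It shows that a class declared without a superclass still derives from object.

```python
# Assumes Python 3. Implicit and Explicit are hypothetical names.
class Implicit:
    pass


class Explicit(object):
    pass


# In Python 3, both classes have object as their only base class,
# so both are new-style classes.
print(Implicit.__bases__ == (object,))
print(Explicit.__bases__ == (object,))
print(Implicit.__mro__ == (Implicit, object))
```

In Python 2 the first declaration would instead produce an old-style class, which is why the explicit object base matters there.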

  • Always name the first argument of an instance method self and the first argument of a classmethod (@classmethod) cls.

    class A(object):
        def method(self, arg):
            pass
        @classmethod
        def constructor(cls, arg):
            pass

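    Note that __new__ receives the class as its first argument and should use cls. (Strictly speaking, Python treats __new__ as a static method that is passed the class explicitly, but the naming convention is the same.)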

    For metaclasses, instance methods should use cls and class methods should use mcl. If you don't know what a metaclass is, don't worry about this.

    Why? Python differs from most object-oriented languages in that the self parameter for referring to the current instance is explicitly passed in as the first argument of a method. This means that technically, you can name the first argument of a method anything. The names self and cls are convention only. In other words, the following are equivalent:

    class Person(object):
        def set_age(self, age):
            self.age = age

    class Person(object):
        # DON'T DO THIS
        def set_age(whatever, age):
            whatever.age = age

    However, conventions are important, because they help people who read your code (including you) to easily understand what it is doing. By always using self, it is clear that whenever self is used it refers to the instance, and whenever you want to refer to the instance, you should use self.

    Using cls for class methods is even more important. If you instead used self, this would suggest that it is an instance, but it isn't! The separate name cls makes it clear that the argument is a class, not an instance.
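
The naming conventions above, including the ones for __new__ and for metaclasses, can be sketched together as follows. This is only an illustration under stated assumptions: it uses Python 3 metaclass syntax, and every name in it (Meta, Point, describe, create, from_pair, norm_squared) is hypothetical.

```python
# Assumes Python 3 (metaclass= keyword). All names are hypothetical.
class Meta(type):
    # An instance method of a metaclass receives the class it is
    # called on, so its first argument is named cls.
    def describe(cls):
        return "class " + cls.__name__

    # A class method of a metaclass receives the metaclass itself,
    # named mcl here to follow the convention described above.
    @classmethod
    def create(mcl, name):
        return mcl(name, (object,), {})


class Point(metaclass=Meta):
    # __new__ receives the class as its first argument, so it uses cls.
    def __new__(cls, x, y):
        obj = super().__new__(cls)
        obj.x = x
        obj.y = y
        return obj

    # A class method receives the class, so it uses cls.
    @classmethod
    def from_pair(cls, pair):
        return cls(pair[0], pair[1])

    # An instance method receives the instance, so it uses self.
    def norm_squared(self):
        return self.x ** 2 + self.y ** 2


p = Point.from_pair((3, 4))
print(p.norm_squared())   # 25
print(Point.describe())   # class Point
```

In each case the name tells the reader what kind of object the first argument is: an instance (self), a class (cls), or a metaclass (mcl).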

  • Always use isinstance when checking for a class.

    PEP 8 has this to say about isinstance:

    Object type comparisons should always use isinstance() instead of comparing types directly.

    Yes: if isinstance(obj, int):

    No: if type(obj) is type(1):

    Comparing type(obj) or obj.__class__ against a target class is wrong. issubclass(type(obj), int) is also wrong.

    Also be aware that such type checks are often themselves a code smell for two reasons:

    • Python is duck-typed, meaning that if an object looks like what you expect it to look like, it generally should just work. For instance, there is no need to check

      if isinstance(i, int):
          some_list[i]

      because some_list[i] will raise the appropriate TypeError if i is not a valid index. Moreover, it is possible to define custom objects that do not subclass from int but can still be used as an index (such as SymPy's Integer), and an isinstance check would wrongly reject them.

      This rule is not absolute, as sometimes a class is so large that the easiest way to ensure that an object adheres to its full interface is to require it to subclass a given class. In SymPy, every object that is used as part of an expression tree should subclass from sympy.Basic.

    • Any code testing isinstance(self, ...) is usually wrong. The code for handling the logic of a given class should generally be in the class itself (see below).
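
The duck-typing point above can be illustrated with a small sketch. EvenIndex is a hypothetical class that stands in for an index-like object such as SymPy's Integer; it is not taken from SymPy. It does not subclass int, yet it works as a list index because it defines __index__.

```python
# EvenIndex is a hypothetical class; it does not subclass int.
class EvenIndex:
    """An index-like object that stands for the integer 2 * k."""

    def __init__(self, k):
        self.k = k

    # Python calls __index__ when an object is used as a sequence index.
    def __index__(self):
        return 2 * self.k


some_list = ["a", "b", "c", "d", "e"]
i = EvenIndex(1)

print(isinstance(i, int))  # False: an isinstance check would reject i
print(some_list[i])        # c: indexing works anyway

# An invalid index raises TypeError without any explicit check.
try:
    some_list["1"]
except TypeError as exc:
    print("TypeError:", exc)
```

An explicit isinstance(i, int) guard would add nothing for invalid input and would block a perfectly usable index object.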

    Why? isinstance(obj, A) differs from type(obj) is A in a very fundamental way: if the class of obj is a subclass of A, then isinstance will return True, but the type check will return False.

    This is important because subclassing is the primary way that people can extend the functionality of object-oriented code that they do not control.

    If you have a method that checks

    if type(obj) is A:
        ...

    it will be impossible for someone to extend A with their own behavior via a subclass class B(A) and use it in your code, because your code will only work for exact instances of A. Note that a subclass, to be correct, should always work in the same places where the original class would work (this is the Liskov substitution principle).

    If you take away one thing from this guide, it is this:

    Always write class code in such a way that things work with potential subclasses.

    This includes both subclasses of your own classes and subclasses of other classes that you reference.
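
The difference between the two kinds of check can be seen in a short sketch. The classes A and B follow the names used above; the functions strict and flexible and the method greet are hypothetical and only serve the illustration.

```python
class A:
    def greet(self):
        return "hello"


# A user-defined extension of A.
class B(A):
    def greet(self):
        return "hello from B"


def strict(obj):
    # DON'T DO THIS: only exact instances of A are accepted.
    if type(obj) is A:
        return obj.greet()
    raise TypeError("expected exactly A")


def flexible(obj):
    # Accepts instances of A and of any subclass of A.
    if isinstance(obj, A):
        return obj.greet()
    raise TypeError("expected an A")


print(flexible(B()))  # hello from B

try:
    strict(B())
except TypeError as exc:
    print("rejected:", exc)  # rejected: expected exactly A
```

The strict version shuts out B even though B supports everything A does.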

  • Don't put logic for a subclass in a superclass.
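
One way to read this rule is sketched below, using hypothetical classes (BadAnimal, BadDog, Animal, Dog, Cat) that are not part of SymPy. The first version tests isinstance(self, ...) in the superclass; the second moves each behavior into the class it belongs to.

```python
# All class names here are hypothetical.

class BadAnimal:
    def sound(self):
        # DON'T DO THIS: the superclass must know about every subclass.
        if isinstance(self, BadDog):
            return "woof"
        return "..."


class BadDog(BadAnimal):
    pass


# Preferred: each subclass overrides the method with its own logic.
class Animal:
    def sound(self):
        return "..."


class Dog(Animal):
    def sound(self):
        return "woof"


# A subclass added later needs no change to Animal.
class Cat(Animal):
    def sound(self):
        return "meow"


print([animal.sound() for animal in (Animal(), Dog(), Cat())])
```

With the second design, anyone can add a new subclass without editing the superclass.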

  • Refer to the name of a class implicitly.

    That is, if you have a class A, don't use the variable A inside any method of A. Instead refer to self.__class__ (or cls for a class method).
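
A minimal sketch of why this matters, using hypothetical classes Vector and LabeledVector: a method that names its own class explicitly returns the base class even when called on a subclass, while self.__class__ and cls preserve the subclass.

```python
# Vector and LabeledVector are hypothetical classes.
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def scaled_hardcoded(self, factor):
        # DON'T DO THIS: always builds a Vector, even for subclasses.
        return Vector(self.x * factor, self.y * factor)

    def scaled(self, factor):
        # Builds an instance of whatever class self actually is.
        return self.__class__(self.x * factor, self.y * factor)

    @classmethod
    def zero(cls):
        # cls is the class the method was called on.
        return cls(0, 0)


class LabeledVector(Vector):
    label = "v"


v = LabeledVector(1, 2)
print(type(v.scaled_hardcoded(2)).__name__)   # Vector
print(type(v.scaled(2)).__name__)             # LabeledVector
print(type(LabeledVector.zero()).__name__)    # LabeledVector
```

The hardcoded version silently drops the subclass, so any behavior added by LabeledVector is lost after scaling.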

  • Don't reassign self.
