In [None]:
PEP 8 and naming best practices
PEP 8 (http://www.python.org/dev/peps/pep-0008) provides a style guide for writing Python code. Besides some basic rules, such as indentation, maximum line length, and other details concerning the code layout, PEP 8 also provides a section on naming conventions that most of the code bases follow.

This section provides only a quick summary of PEP 8, and a handy naming guide for each kind of Python syntax element.
You should still consider reading the PEP 8 document as mandatory.

In [None]:
Why and when to follow PEP 8?
If you are creating a new software package that is intended to be open sourced, you should always follow PEP 8 because it is a widely accepted standard and is used in most of the open source projects written in Python. If you want to foster any collaboration with other programmers, then you should definitely stick to PEP 8, even if you have different views on the best code style guidelines. Doing so has the benefit of making it a lot easier for other developers to jump straight into your project. Code will be easier to read for newcomers because it will be consistent in style with most of the other Python open source packages.

Also, starting with full PEP 8 compliance saves you time and trouble in the future. If you want to release your code to the public, you will eventually face suggestions from fellow programmers to switch to PEP 8. Arguments as to whether it is really necessary to do so for a particular project tend to be never-ending flame wars that are impossible to win. This is the sad truth, but you may be eventually forced to be consistent with it or risk losing valuable contributors.

Also, restyling the whole code base of a mature project might require a tremendous amount of work. In some cases, such restyling might require changing almost every line of code. While most of the changes can be automated (indentation, newlines, and trailing whitespace), such a massive code overhaul usually introduces a lot of conflicts in every version control workflow that is based on branching. It is also very hard to review so many changes at once. These are the reasons why many open source projects have a rule that style-fixing changes should always be included in separate pull/merge requests or patches that do not affect any feature or bug.

In [None]:
Beyond PEP 8 – Team-specific style guidelines
Despite providing a comprehensive set of style guidelines, PEP 8 still leaves some freedom to developers, especially in terms of nested data literals and multiline function calls that require long lists of arguments. Some teams may decide that they require additional styling rules, and the best option is to formalize them in some kind of document that is available to every team member.

Also, in some situations, it may be impossible or economically infeasible to be strictly consistent with PEP 8 in some old projects that had no style guide defined. Such projects will still benefit from formalization of the actual coding conventions even if they do not reflect the official set of PEP 8 rules. Remember, what is more important than consistency with PEP 8 is consistency within the project. If rules are formalized and available as a reference for every programmer, then it is way easier to keep consistency within a project and organization.

Let's take a look at the different naming styles in the next section.

In [None]:
Naming styles
The different naming styles used in Python are:

CamelCase
mixedCase
UPPERCASE and UPPER_CASE_WITH_UNDERSCORES
lowercase and lower_case_with_underscores
_leading and trailing_ underscores, and sometimes __doubled__ underscores
Lowercase and uppercase elements are often a single word, and sometimes a few words concatenated. With underscores, they are usually abbreviated phrases. Using a single word is better. Leading and trailing underscores are used to mark private and special elements.

These styles are applied to the following:

Variables
Functions and methods
Properties
Classes
Modules
Packages

In [None]:
Variables
There are the following two kinds of variables in Python:

Constants: These define values that are not supposed to change during program execution
Public and private variables: These hold the state of applications that can change during program execution

In [None]:
Constants
For constant global variables, uppercase with underscores is used. It informs the developer that the given variable represents a constant value.

There are no real constants in Python like those in C++, where const can be used. You can change the value of any variable. That's why Python uses a naming convention to mark a variable as a constant.
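A minimal sketch of this point: the interpreter happily rebinds an uppercase name. The Final annotation from the typing module (available since Python 3.8) is shown here as one optional way to let a static type checker flag such a rebinding; it does not change runtime behavior. The MAX_THREADS name is only an illustration:

```python
from typing import Final

MAX_THREADS: Final = 4

# Nothing stops this at runtime. Only the naming convention (and, optionally,
# a static type checker that understands Final) signals that it is a mistake.
MAX_THREADS = 8

print(MAX_THREADS)
```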
For example, the doctest module provides a list of option flags and directives (http://docs.python.org/lib/doctest-options.html) that are small sentences, clearly defining what each option is intended for, for example:

from doctest import IGNORE_EXCEPTION_DETAIL 
from doctest import REPORT_ONLY_FIRST_FAILURE 
These variable names seem rather long, but it is important to clearly describe them. Their usage is mostly located in the initialization code rather than in the body of the code itself, so this verbosity is not annoying.

Abbreviated names obfuscate the code most of the time. Don't be afraid of using complete words when an abbreviation seems unclear.
Some constants' names are also driven by the underlying technology. For instance, the os module uses some constants that are defined on the C side, such as the EX_XXX series, which defines UNIX exit code numbers. The same names can be found in the system's sysexits.h C header file. The following example uses one of them:

import os
import sys
sys.exit(os.EX_SOFTWARE)
Another good practice when using constants is to gather all of them at the top of a module that uses them. It is also common to combine them under new variables if they are flags or enumerations that allow for such operations, for example:

import doctest 
TEST_OPTIONS = (doctest.ELLIPSIS | 
                doctest.NORMALIZE_WHITESPACE |  
                doctest.REPORT_ONLY_FIRST_FAILURE) 
Let's take a look at the naming and usage of constants in the next section. 

In [None]:
Naming and usage
Constants are used to define a set of values the program relies on, such as the default configuration filename.

A good practice is to gather all the constants in a single file in the package. That is how Django works, for instance, with its settings.py module. A similar module, here named config.py, could provide all the constants as follows:

# config.py 
SQL_USER = 'tarek' 
SQL_PASSWORD = 'secret' 
SQL_URI = 'postgres://%s:%s@localhost/db' % ( 
    SQL_USER, SQL_PASSWORD 
) 
MAX_THREADS = 4 
Another approach is to use a configuration file that can be parsed with the configparser module, or another configuration parsing tool. But some people argue that it is overkill to use another file format in a language such as Python, where a source file can be edited and changed as easily as a text file.
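For comparison, the configuration-file approach could look roughly like the following sketch, which uses the standard library configparser module. The section and option names are assumptions made up for this example, and the configuration is read from a string only to keep the snippet self-contained; a real project would typically read it from a file:

```python
import configparser

# Hypothetical configuration content; normally this would live in an .ini file.
CONFIG_TEXT = """
[database]
user = tarek
password = secret
max_threads = 4
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Values are strings by default; getint() converts explicitly.
SQL_USER = parser["database"]["user"]
SQL_PASSWORD = parser["database"]["password"]
MAX_THREADS = parser.getint("database", "max_threads")

print(SQL_USER, MAX_THREADS)
```

Note that the constants still end up as uppercase module-level names, so the rest of the code does not need to know where the values came from.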

For options that act like flags, a common practice is to combine them with bitwise operations, as the doctest and re modules do. The pattern taken from doctest is quite simple, as shown in the following code:

OPTIONS = {}

def register_option(name):
    return OPTIONS.setdefault(name, 1 << len(OPTIONS))

def has_option(options, name):
    return bool(options & name)

# now defining options
BLUE = register_option('BLUE')
RED = register_option('RED')
WHITE = register_option('WHITE')
This code allows for the following usage:

>>> # let's try them 
>>> SET = BLUE | RED 
>>> has_option(SET, BLUE) 
True 
>>> has_option(SET, WHITE) 
False
When you define a new set of constants, avoid using a common prefix for them, unless the module has several independent sets of options. The module name itself is a common prefix.

Another good solution for option-like constants would be to use the Enum class from the enum module of the standard library and simply rely on the set collection instead of the bitwise operators. Details of the enum module usage and syntax were explained in the Symbolic enumeration with enum module section of Chapter 3, Modern Syntax Elements - Below the Class Level.
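A minimal sketch of that Enum-and-set alternative might look as follows. The Color class is a hypothetical stand-in for the BLUE, RED, and WHITE options defined earlier:

```python
from enum import Enum


class Color(Enum):
    BLUE = "blue"
    RED = "red"
    WHITE = "white"


def has_option(options, option):
    """Return True if the given option is part of the selected set."""
    return option in options


# A set of enum members plays the role of BLUE | RED.
selected = {Color.BLUE, Color.RED}

print(has_option(selected, Color.BLUE))
print(has_option(selected, Color.WHITE))
```

With this design, membership tests replace the bitwise AND, and there is no need to assign distinct powers of two to the individual options.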

Using binary bit-wise operations to combine options is common in Python. The inclusive OR (|) operator will let you combine several options in a single integer, and the AND (&) operator will let you check that the option is present in the integer (refer to the has_option function).
Let's discuss public and private variables in the following section.

In [None]:
Public and private variables
For global variables that are mutable and freely available through imports, lowercase with underscores should be used when they do not need to be protected. If a variable shouldn't be used or modified outside of its origin module, we consider it a private member of that module. A leading underscore, in that case, can mark the variable as a private element of the module, as shown in the following code:

_observers = []
def add_observer(observer):
    _observers.append(observer)
def get_observers():
    """Makes sure _observers cannot be modified."""
    return tuple(_observers)
Variables that are located in functions and methods follow the same rules as public variables and are never marked as private, since they are local to the function context.

For class or instance variables, you should use the private marker (the leading underscore) if making the variable a part of the public signature does not bring any useful information, or is redundant. In other words, if the variable is used only internally for the purpose of some other method that provides an actual public feature, it is better to make it private.

For instance, the attributes that are powering a property are good private citizens, as shown in the following code:

class Citizen(object):
    def __init__(self, first_name, last_name):
        self._first_name = first_name
        self._last_name = last_name

    @property
    def full_name(self):
        return f"{self._first_name} {self._last_name}"
Another example would be a variable that keeps some internal state that should not be disclosed to other classes. This value is not useful for the rest of the code, but participates in the behavior of the class:

class UnforgivingElephant(object):
    def __init__(self, name):
        self.name = name
        self._people_to_stomp_on = []

    def get_slapped_by(self, name):
        self._people_to_stomp_on.append(name)
        print('Ouch!')

    def revenge(self):
        print('10 years later...')
        for person in self._people_to_stomp_on:
            print('%s stomps on %s' % (self.name, person))
Here is what you'll see in an interactive session:

>>> joe = UnforgivingElephant('Joe') 
>>> joe.get_slapped_by('Tarek') 
Ouch! 
>>> joe.get_slapped_by('Bill') 
Ouch! 
>>> joe.revenge() 
10 years later... 
Joe stomps on Tarek 
Joe stomps on Bill 
Let's take a look at naming styles for functions and methods in the next section.

In [None]:
Functions and methods
Functions and methods should be in lowercase with underscores. This rule was not always true in the old standard library modules. Python 3 did a lot of reorganization of the standard library, so most of the functions and methods have a consistent letter case. Still, for some modules such as threading, you can access the old function names that used mixedCase (for example, currentThread). This was left to allow easier backward compatibility, but if you don't need to run your code in older versions of Python, then you should avoid using these old names.

This way of writing methods was common before the lowercase norm became the standard, and some frameworks, such as Zope and Twisted, still use mixedCase for methods. The community of developers working with them is still quite large. So the choice between mixedCase and lowercase with underscores is definitely driven by the libraries you are using.

As a Zope developer, it is not easy to stay consistent because building an application that mixes pure Python modules and modules that import Zope code is difficult. In Zope, some classes mix both conventions because the code base is still evolving and Zope developers try to adopt the common conventions accepted by so many.

A decent practice in this kind of library environment is to use mixedCase only for elements that are exposed in the framework, and to keep the rest of the code in PEP 8 style.

It is also worth noting that developers of the Twisted project took a completely different approach to this problem. The Twisted project, same as Zope, predates the PEP 8 document. It was started when there were no official guidelines for Python code style, so it had its own guidelines. Stylistic rules about the indentation, docstrings, line lengths, and so on could be easily adopted. On the other hand, updating all the code to match naming conventions from PEP 8 would result in completely broken backward compatibility. And doing that for such a large project as Twisted is infeasible. So Twisted adopted as much of PEP 8 as possible and left things such as mixedCase for variables, functions, and methods as part of its own coding standard. And this is completely compatible with the PEP 8 suggestion because it exactly says that consistency within a project is more important than consistency with PEP 8's style guide.

In [None]:
The private controversy
For private methods and functions, we usually use a single leading underscore. This is only a naming convention and has no syntactical meaning. But it doesn't mean that leading underscores have no syntactical meaning at all. When a method has two leading underscores, it is renamed on the fly by the interpreter to prevent a name collision with a method from any subclass. This feature of Python is called name mangling.

So some people tend to use a double leading underscore for their private attributes to avoid name collision in the subclasses, for example:

class Base(object):
    def __secret(self):
        print("don't tell")

    def public(self):
        self.__secret()

class Derived(Base):
    def __secret(self):
        print("never ever")
From this you will see the following output:

>>> Base.__secret
Traceback (most recent call last):
  File "<input>", line 1, in <module>
AttributeError: type object 'Base' has no attribute '__secret'
>>> dir(Base)
['_Base__secret', ..., 'public']
>>> Base().public()
don't tell
>>> Derived().public()
don't tell
The original motivation for name mangling in Python was not to provide the same isolation primitive as a private keyword in C++ but to make sure that some base classes implicitly avoid collisions in subclasses, especially if they are intended to be used in multiple inheritance contexts (for example, as mixin classes). But using it for every attribute that isn't public obfuscates the code and makes it extremely hard to extend. This is not Pythonic at all.
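The mixin use case mentioned above can be sketched as follows. CacheMixin and Inventory are hypothetical classes invented for this illustration; both happen to pick the same attribute name, and name mangling keeps the two attributes apart:

```python
class CacheMixin:
    def __init__(self):
        super().__init__()
        self.__store = {}              # actually stored as _CacheMixin__store

    def cache(self, key, value):
        self.__store[key] = value

    def cached(self, key):
        return self.__store.get(key)


class Inventory(CacheMixin):
    def __init__(self):
        super().__init__()
        self.__store = "warehouse 7"   # actually stored as _Inventory__store

    def location(self):
        return self.__store


inventory = Inventory()
inventory.cache("apples", 3)

print(inventory.cached("apples"))
print(inventory.location())
print(sorted(vars(inventory)))
```

The mangled names remain reachable from the outside (for example, inventory._CacheMixin__store), which shows that this mechanism is about avoiding accidental collisions rather than about access control.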

For more information on this topic, an interesting thread occurred in the Python-Dev mailing list many years ago, where people argued on the utility of name mangling and its fate in the language. It can be found at http://mail.python.org/pipermail/python-dev/2005-December/058555.html.

Let's take a look at the naming styles for special methods.

In [None]:
Special methods
Special methods (https://docs.python.org/3/reference/datamodel.html#special-method-names) start and end with a double underscore and form so-called protocols of the language (see Chapter 4, Modern Syntax Elements - Above the Class Level). Some developers used to call them dunder methods as a portmanteau of double underscore. They are used for operator overloading, container definitions, and so on. For the sake of readability, they should be gathered at the beginning of class definitions, as shown in the following code:

class WeirdInt(int):
    def __add__(self, other):
        return int.__add__(self, other) + 1

    def __repr__(self):
        return '<weirdo %d>' % self

    # public API
    def do_this(self):
        print('this')

    def do_that(self):
        print('that')
No user-defined method should use this convention unless it explicitly has to implement one of the Python object protocols. So don't invent your own dunder methods such as this:

class BadHabits: 
    def __my_method__(self): 
        print('ok') 
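By contrast, implementing special methods that belong to an existing protocol is exactly what the convention is for. The following sketch uses a hypothetical Playlist class that supports len(), the in operator, and iteration, with the special methods gathered at the top of the class:

```python
class Playlist:
    def __init__(self, tracks):
        self._tracks = list(tracks)

    # special methods first, implementing well-known protocols
    def __len__(self):
        return len(self._tracks)

    def __contains__(self, track):
        return track in self._tracks

    def __iter__(self):
        return iter(self._tracks)

    # public API
    def add(self, track):
        self._tracks.append(track)


playlist = Playlist(["intro", "theme"])
playlist.add("outro")

print(len(playlist))
print("theme" in playlist)
print(list(playlist))
```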
Let's discuss the naming styles for arguments in the next section.

In [None]:
Arguments
Arguments are in lowercase, with underscores if needed. They follow the same naming rules as variables because arguments are simply local variables that get their value as function input values. In the following example, text and separator are arguments of one_line() function:

def one_line(text, separator=" "):
    """Convert possibly multiline text to single line"""
    return separator.join(text.split())
The naming style of properties is discussed in the next section.

In [None]:
Properties
The names of properties are in lowercase, or in lowercase with underscores. Most of the time they represent an object's state, which can be a noun or an adjective, or a small phrase when needed. In the following code example, the Container class is a simple data structure that can return copies of its contents through unique_items and ordered_items properties:

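class Container:
    def __init__(self):
        # A per-instance list; a class-level list would be shared by all instances.
        self._contents = []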

    def append(self, item):
        self._contents.append(item)

    @property
    def unique_items(self):
        return set(self._contents)

    @property
    def ordered_items(self):
        return list(self._contents)
Let's take a look at the naming styles used for classes.

In [None]:
Classes
The names of classes are always in CamelCase, and may have a leading underscore when they are private to a module.

In object-oriented programming, classes are used to encapsulate the application state. Attributes of objects are a record of that state. Methods are used to modify that state, convert it into meaningful values, or produce side effects. This is why class names are often noun phrases and form a usage logic with the method names that are verb phrases. The following code example contains a Document class definition with a single save() method:

class Document():
    file_name: str
    contents: str
    ...    

    def save(self):
        with open(self.file_name, 'w') as file:
            file.write(self.contents)
Class instances often use the same noun phrases as the class name but spelled in lowercase. So, actual Document class usage could be as follows:

new_document = Document()
new_document.save()
Let's go through the naming styles for modules and packages.

In [None]:
Modules and packages
Besides the special module __init__, the module names are in lowercase. The following are some examples from the standard library:

os
sys
shutil
The Python standard library does not use underscores to separate words in module names, but they are commonly used in many other projects. When the module is private to the package, a leading underscore is added. Compiled C or C++ modules are usually named with a leading underscore and imported in pure Python modules. Package names follow the same rules, since they act more like structured modules.

Let's discuss the naming guide in the next section.

In [None]:
The naming guide
A common set of naming rules can be applied on variables, methods, functions, and properties. The names of classes and modules play a very important role in namespace construction and greatly affect code readability. This section contains a miniguide that will help you to define meaningful and readable names for your code elements.

In [None]:
Using the has/is prefixes for Boolean elements
When an element holds a Boolean value, you can mark it with an is or has prefix to make the variable more readable. In the following example, is_connected and has_cache are such identifiers that hold Boolean states of the DB class instances:

class DB: 
    is_connected = False 
    has_cache = False

In [None]:
Using plurals for variables that are collections
When an element is holding a sequence, it is a good idea to use a plural form. You can also do the same for various mapping variables and properties. In the following example, connected_users and tables are class attributes that hold multiple values:

class DB: 
    connected_users = ['Tarek'] 
    tables = {'Customer':['id', 'first_name', 'last_name']} 

In [None]:
Using explicit names for dictionaries
When a variable holds a mapping, you should use an explicit name when possible. For example, if a dict holds people's addresses, it can be named persons_addresses:

persons_addresses = {'Bill': '6565 Monty Road',  
                     'Pamela': '45 Python street'}

In [None]:
Avoid generic names and redundancy
You should generally avoid using explicit type names such as list, dict, and set as parts of variable names, even for local variables. Python now offers function and variable annotations and a typing hierarchy that allows you to easily mark an expected type for a given variable, so there is no longer a need to describe object types in their names. Doing so only makes the code harder to read, understand, and use. Using a built-in name has to be avoided as well, so that it is not shadowed in the current namespace. Generic verbs should also be avoided, unless they have a meaning in the namespace.

Instead, domain-specific terms should be used as follows:

def compute(data):  # too generic
    for element in data:
        yield element ** 2

def squares(numbers):  # better
    for number in numbers:
        yield number ** 2
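The earlier remark about annotations replacing type names in identifiers can be sketched in the same spirit. In the hypothetical function below, the names describe the domain (lines, word_counts) while the annotations carry the type information that names such as line_list or count_dict would otherwise try to encode:

```python
from typing import Dict, Iterable


def count_words(lines: Iterable[str]) -> Dict[str, int]:
    """Count how often each word occurs in the given lines."""
    word_counts: Dict[str, int] = {}
    for line in lines:
        for word in line.split():
            word_counts[word] = word_counts.get(word, 0) + 1
    return word_counts


word_counts = count_words(["spam eggs", "spam"])
print(word_counts)
```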
There is also the following list of prefixes and suffixes that, despite being very common in programming, should be, in fact, avoided in function and class names:

Manager
Object
Do, handle, or perform
The reason for this is that they are vague, ambiguous, and do not add any value to the actual name. Jeff Atwood, the co-founder of Discourse and Stack Overflow, has a very good article on this topic and it can be found on his blog at http://blog.codinghorror.com/i-shall-call-it-somethingmanager/

There is also a list of package names that should be avoided. Everything that does not give any clue about its content can do a lot of harm to the project in the long term. Names such as misc, tools, utils, common, or core have a very strong tendency to become endless bags of various unrelated code pieces of very poor quality that seem to grow in size exponentially. In most cases, the existence of such a module is a sign of laziness or lack of enough design efforts. Enthusiasts of such module names can simply forestall the future and rename them to trash or dumpster because this is exactly how their teammates will eventually treat such modules.

It is almost always better to have more small modules, even with very little content, but with names that reflect well what is inside. To be honest, there is nothing inherently wrong with names such as utils and common, and it is possible to use them responsibly. But reality shows that in many cases they instead become a stub for dangerous structural antipatterns that proliferate very fast. And if you don't act fast enough, you may not be able to get rid of them ever. So the best approach is simply to avoid such risky organizational patterns and nip them in the bud.

In [None]:
Avoiding existing names
It is a bad practice to use names that shadow other names that already exist in the same context. It makes code reading and debugging very confusing. Always try to define original names, even if they are local to the context. If you eventually have to reuse existing names or keywords, use a trailing underscore to avoid name collision, for example:

def xapian_query(terms, or_=True): 
    """if or_ is true, terms are combined with the OR clause""" 
    ...
Note that the class keyword is often replaced by klass or cls:

def factory(klass, *args, **kwargs): 
    return klass(*args, **kwargs) 
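To make the risk of shadowing more tangible, here is a small sketch with two hypothetical functions. In the first one, an argument named max hides the built-in max() function inside the function body; the second one uses a trailing underscore and keeps the built-in available:

```python
def longest_word(words, max=None):
    # The argument shadows the built-in max(), so this line tries to call
    # the argument's value (None by default) instead of the built-in.
    return max(words, key=len)


def longest_word_fixed(words, max_=None):
    # max_ is an optional length limit; the built-in max() stays usable.
    candidates = [word for word in words if max_ is None or len(word) <= max_]
    return max(candidates, key=len)


try:
    longest_word(["spam", "eggs", "bacon"])
except TypeError as error:
    print("shadowing broke the call:", error)

print(longest_word_fixed(["spam", "eggs", "bacon"]))
print(longest_word_fixed(["spam", "eggs", "bacon"], max_=4))
```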
Let's take a look at some of the best practices to keep in mind while working with arguments.

In [None]:
Best practices for arguments
The signatures of functions and methods are the guardians of code integrity. They drive its usage and build its APIs. Besides the naming rules that we have discussed previously, special care has to be taken for arguments. This can be done through the following three simple rules:

Build arguments by iterative design.
Trust the arguments and your tests.
Use *args and **kwargs magic arguments carefully.

In [None]:
Building arguments by iterative design
Having a fixed and well-defined list of arguments for each function makes the code more robust. But this can't be done in the first version, so arguments have to be built by iterative design. They should reflect the precise use cases the element was created for, and evolve accordingly.

Consider the following example of the first versions of some Service class:

class Service:  # version 1
    def _query(self, query, type):
        print('done')

    def execute(self, query):
        self._query(query, 'EXECUTE')
If you want to extend the signature of the execute() method with new arguments in a way that preserves backward compatibility, you should provide default values for these arguments as follows: 

import logging

class Service:  # version 2
    def _query(self, query, type, logger):
        logger('done')

    def execute(self, query, logger=logging.info):
        self._query(query, 'EXECUTE', logger)
The following example from an interactive session presents two styles of calling the execute() method of the updated Service class: 

>>> Service().execute('my query')    # old-style call 
>>> Service().execute('my query', logging.warning) 
WARNING:root:done

In [None]:
Trusting the arguments and your tests
Given the dynamic typing nature of Python, some developers use assertions at the top of their functions and methods to make sure the arguments have proper content, for example:

def divide(dividend, divisor): 
    assert isinstance(dividend, (int, float)) 
    assert isinstance(divisor, (int, float)) 
    return dividend / divisor
This is often done by developers who are used to static typing and feel that something is missing in Python.

This way of checking arguments is a part of the Design by Contract (DbC) programming style, where preconditions are checked before the code is actually run.

The two main problems with this approach are as follows:

The DbC code explains how the function should be used, making the function itself less readable
It can make the code slower, since the assertions are evaluated on each call
The latter can be avoided with the -O option of the Python interpreter. In that case, all assertions are removed from the code before the byte code is created, so that the checking is lost.

In any case, assertions have to be used carefully, and should not be used to bend Python into a statically typed language. The only use case for this is to protect the code from being called nonsensically. If you really want to have some kind of static typing in Python, you should definitely try mypy or a similar static type checker that does not affect your code runtime and allows you to provide type definitions in a more readable form as function and variable annotations.
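As a sketch of that alternative, the divide() function from the earlier example could carry its expectations as annotations instead of assertions. The interpreter ignores the annotations when the function is called, so there is no per-call cost, while a static type checker can inspect them before the program runs:

```python
def divide(dividend: float, divisor: float) -> float:
    """Divide two numbers; the annotations document the expected types."""
    return dividend / divisor


result = divide(9, 2)
print(result)

# A static type checker would be expected to report a call such as
# divide("9", 2) as an error, without that call ever being executed.
```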

In [None]:
Using *args and **kwargs magic arguments carefully
The *args and **kwargs arguments can break the robustness of a function or method. They make the signature fuzzy, and the code often starts to become a small argument parser where it should not, for example:

def fuzzy_thing(**kwargs):
    if 'do_this' in kwargs:
        print('ok i did this')

    if 'do_that' in kwargs:
        print('that is done')

    print('ok')

>>> fuzzy_thing(do_this=1)
ok i did this
ok
>>> fuzzy_thing(do_that=1)
that is done
ok
>>> fuzzy_thing(what_about_that=1)
ok
If the argument list gets long and complex, it is tempting to add magic arguments. But this is more a sign of a weak function or method that should be broken into pieces or refactored.

When *args is used to deal with a sequence of elements that are treated the same way in the function, asking for a single container argument such as an iterable is better, for example:

def sum(*args):  # okay
    total = 0
    for arg in args:
        total += arg
    return total

def sum(sequence):  # better!
    total = 0
    for arg in sequence:
        total += arg
    return total
For **kwargs, the same rule applies. It is better to fix the named arguments to make the method's signature meaningful, for example:

def make_sentence(**kwargs):
    noun = kwargs.get('noun', 'Bill')
    verb = kwargs.get('verb', 'is')
    adjective = kwargs.get('adjective', 'happy')
    return f'{noun} {verb} {adjective}'

def make_sentence(noun='Bill', verb='is', adjective='happy'):
    return f'{noun} {verb} {adjective}'
Another interesting approach is to create a container class that groups several related arguments to provide an execution context. This structure differs from *args or **kwargs because it can provide internals that work over the values, and can evolve independently. The code that uses it as an argument will not have to deal with its internals.

For instance, a web request passed on to a function is often represented by an instance of a class. This class is in charge of holding the data passed by the web server, as shown in the following code:

def log_request(request):  # version 1 
    print(request.get('HTTP_REFERER', 'No referer'))  

def log_request(request):  # version 2 
    print(request.get('HTTP_REFERER', 'No referer')) 
    print(request.get('HTTP_HOST', 'No host')) 
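A container class of the kind described above might look like the following sketch. The Request class, its environ attribute, and the describe_request() function are hypothetical names chosen for this illustration and do not correspond to any particular web framework:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Request:
    """Hypothetical container for the data passed by the web server."""
    environ: Dict[str, str] = field(default_factory=dict)

    def get(self, key, default=None):
        return self.environ.get(key, default)

    @property
    def referer(self):
        # The container can grow helpers like this one without any change
        # to the signatures of the functions that receive it.
        return self.get('HTTP_REFERER', 'No referer')


def describe_request(request):
    return f"{request.get('HTTP_HOST', 'No host')} <- {request.referer}"


request = Request({'HTTP_HOST': 'example.com'})
print(describe_request(request))
```

Functions such as describe_request() keep a single, stable argument, while the class itself is free to evolve.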
Magic arguments cannot be avoided sometimes, especially in metaprogramming. For instance, they are indispensable in the creation of decorators that work on functions with any kind of signature.

Let's discuss class names in the next section.

In [None]:
Class names
The name of a class has to be concise, precise, and descriptive. A common practice is to use a suffix that informs about its type or nature, for example:

SQLEngine
MimeTypes
StringWidget
TestCase
For base or abstract classes, a Base or Abstract prefix can be used as follows:

BaseCookie
AbstractFormatter
The most important thing is to be consistent with the class attributes. For example, try to avoid redundancy between the class and its attributes' names as follows:

>>> SMTP.smtp_send()  # redundant information in the namespace 
>>> SMTP.send()       # more readable and mnemonic            
Let's take a look at module and package names in the next section.

In [None]:
Module and package names
The module and package names inform about the purpose of their content. The names are short, in lowercase, and usually without underscores, for example:

sqlite
postgres
sha1
They are often suffixed with lib if they are implementing a protocol, as in the following:

import smtplib 
import urllib 
import telnetlib
When choosing a name for a module, always consider its content and limit the amount of redundancy within the whole namespace, for example:

from widgets.stringwidgets import TextWidget  # bad 
from widgets.strings import TextWidget        # better 
When a module is getting complex and contains a lot of classes, it is a good practice to create a package and split the module's elements into other modules.

The __init__ module can also be used to put back some common APIs at the top level of the package. This approach allows you to organize the code into smaller components without reducing the ease of use.

Let's take a look at some of the useful tools used while working with naming conventions and styles.

In [None]:
Useful tools
Common conventions and practices used in a software project should always be documented. But having proper documentation for guidelines is often not enough to enforce that these guidelines are actually followed. Fortunately, you can use automated tools that can check sources of your code and verify if it meets specific naming conventions and style guidelines.

The following are a few popular tools:

pylint: This is a very flexible source code analyzer
pycodestyle and flake8: This is a small code style checker and a wrapper that adds to it some more useful features, such as static analysis and complexity measurement

In [None]:
Pylint
Besides some quality assurance metrics, Pylint can check whether a given source code follows a naming convention. Its default settings correspond to PEP 8, and the pylint script prints its report to the shell.

To install Pylint, you can use pip as follows:

$ pip install pylint
After this step, the command is available and can be run against a module, or several modules using wildcards. Let's try it on Buildout's bootstrap.py script as follows:

$ wget -O bootstrap.py https://bootstrap.pypa.io/bootstrap-buildout.py -q
$ pylint bootstrap.py
No config file found, using default configuration
************* Module bootstrap
C: 76, 0: Unnecessary parens after 'print' keyword (superfluous-parens)
C: 31, 0: Invalid constant name "tmpeggs" (invalid-name)
C: 33, 0: Invalid constant name "usage" (invalid-name)
C: 45, 0: Invalid constant name "parser" (invalid-name)
C: 74, 0: Invalid constant name "options" (invalid-name)
C: 74, 9: Invalid constant name "args" (invalid-name)
C: 84, 4: Import "from urllib.request import urlopen" should be placed at the top of the module (wrong-import-position)

...


Global evaluation
-----------------
Your code has been rated at 6.12/10
The real Pylint output is a bit longer; it has been truncated here for the sake of brevity.

Remember that Pylint can often give you false positive warnings that decrease the overall quality rating. For instance, an import statement that is not used by the code of the module itself is perfectly fine in some cases (for example, building top-level __init__ modules in a package). Always treat Pylint's output as a hint and not an oracle. 

Making calls to libraries that are using mixedCase for methods can also lower your rating. In any case, the global evaluation of your code score is not that important. Pylint is just a tool that points you to places where there is the possibility for improvements.

It is always recommended to do some tuning of Pylint. To do so, you need to create a .pylintrc configuration file in your project's root directory. You can generate one using the --generate-rcfile option of the pylint command:

$ pylint --generate-rcfile > .pylintrc 
This configuration file is self-documenting (every possible option is described with a comment) and should already contain every available Pylint configuration option.

Besides checking for compliance with some arbitrary coding standards, Pylint can also give additional information about the overall code quality, such as:

Code duplication metrics
Unused variables and imports
Missing function, method, or class docstrings
Too long function signatures
The list of available checks that are enabled by default is very long. It is important to know that some of the rules are very arbitrary and cannot always be easily applied to every code base. Remember that consistency is always more valuable than compliance to some arbitrary rules. Fortunately, Pylint is very tunable, so if your team uses some naming and coding conventions that are different from the ones assumed by default, you can easily configure Pylint to check for consistency with your own conventions.

In [None]:
pycodestyle and flake8
pycodestyle (formerly pep8) is a tool with only one purpose: it checks code style against the conventions defined in PEP 8. This is the main difference from Pylint, which has many additional features. It is the best option for programmers who are interested in automated code style checking only for the PEP 8 standard, without any additional tool configuration, as in Pylint's case.

pycodestyle can be installed with pip as follows:

$ pip install pycodestyle
When run on the Buildout's bootstrap.py script, it will give the following short list of code style violations:

$ wget -O bootstrap.py https://bootstrap.pypa.io/bootstrap-buildout.py -q
$ pycodestyle bootstrap.py
bootstrap.py:118:1: E402 module level import not at top of file
bootstrap.py:119:1: E402 module level import not at top of file
bootstrap.py:190:1: E402 module level import not at top of file
bootstrap.py:200:1: E402 module level import not at top of file
The main difference from Pylint's output is its length. pycodestyle concentrates only on style, so it does not provide any other warnings, such as unused variables, too long function names, or missing docstrings. It also does not give a rating. And it really makes sense because there is no such thing as partial consistency or partial conformance. Any, even the slightest, violation of style guidelines makes the code immediately inconsistent.
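Each reported line follows a regular `path:line:column: code message` layout, so it is simple to process in a script. The following is only a sketch: the `parse_report` helper and the regular expression are illustrative assumptions and not part of pycodestyle itself, and the sample report is the output shown above.

```python
import re

REPORT = """\
bootstrap.py:118:1: E402 module level import not at top of file
bootstrap.py:119:1: E402 module level import not at top of file
bootstrap.py:190:1: E402 module level import not at top of file
bootstrap.py:200:1: E402 module level import not at top of file
"""

# path:line:column: CODE message
LINE_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<code>\w+) (?P<text>.*)$"
)


def parse_report(report):
    """Return a list of (path, line, column, code, text) tuples."""
    violations = []
    for raw_line in report.splitlines():
        match = LINE_RE.match(raw_line)
        if match:
            violations.append((
                match["path"],
                int(match["line"]),
                int(match["col"]),
                match["code"],
                match["text"],
            ))
    return violations


violations = parse_report(REPORT)
print(len(violations), "violations found")
```

A continuous integration job could, for example, fail the build whenever such a list is not empty.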

The code of pycodestyle is simpler than Pylint's and its output is easier to parse, so it may be a better choice if you want to make your code style verification part of a continuous integration process. If you are missing some static analysis features, there is the flake8 package that is a wrapper on pycodestyle and a few other tools that are easily extendable and provide a more extensive suite of features. These include the following:

McCabe complexity measurement
Static analysis via pyflakes
Disabling whole files or single lines using comments
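As a short illustration of the comment-based switches, the snippet below marks an intentionally unused import so that it is not reported. The module content is made up for the example; `F401` is the pyflakes code for an unused import, and the second import shows the comparable inline comment understood by Pylint.

```python
# Line-level marker: flake8 skips the named check (F401, unused
# import) on this line only.
import json  # noqa: F401

# Pylint has its own inline comment for the same situation.
import os.path  # pylint: disable=unused-import

# A bare "# noqa" silences every flake8 check reported for the line.
GREETING = "a deliberately unremarkable string"  # noqa

# To skip a whole file, flake8 looks for a line consisting of the
# marker below. It is kept in a string here so that this example
# file would still be checked.
FILE_LEVEL_MARKER = "# flake8: noqa"
```

Use such markers sparingly, because every silenced line is a place where real problems can go unnoticed.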

In [None]:
Summary
In this chapter, we have discussed the most common and widely accepted coding conventions. We started with the official Python style guide (the PEP 8 document). The official style guide was complemented by some naming suggestions that will make your future code more explicit. We have also seen some useful tools that are indispensable in maintaining the consistency and quality of your code.

All of this prepares us for the first practical topic of the book—writing and distributing your own packages. In the next chapter, we will learn how to publish our very own package on a public PyPI repository and also how to leverage the power of packaging ecosystems in your private organization.

In [None]:
Writing a Package
This chapter focuses on a repeatable process of writing and releasing Python packages. We will see how to shorten the time needed to set up everything before starting the real work. We will also learn how to provide a standardized way to write packages and ease the use of a test-driven development approach. We will finally learn how to facilitate the release process.

It is organized into the following four parts:

A common pattern for all packages that describes the similarities between all Python packages, and how distutils and setuptools play a central role in the packaging process.
What are namespace packages and why they can be useful?
How to register and upload packages in the Python Package Index (PyPI) with emphasis on security and common pitfalls.
The standalone executables as an alternative way to package and distribute Python applications.
In this chapter, we will cover the following topics:

Creating a package
Namespace packages
Uploading a package
Standalone executables

In [None]:
Technical requirements
Here are Python packages mentioned in this chapter that you can download from PyPI:

twine
wheel
cx_Freeze
py2exe
pyinstaller
You can install these packages using the following command:

python3 -m pip install <package-name>
Code files for this chapter can be found at https://github.com/PacktPublishing/Expert-Python-Programming-Third-Edition/tree/master/chapter7.

In [None]:
Creating a package
Python packaging can be a bit overwhelming at first. The main reason for that is the confusion about the proper tools for creating Python packages. Still, once you create your first package, you will see that this is not as hard as it looks. Also, knowing the proper, state-of-the-art packaging tools helps a lot.

You should know how to create packages even if you are not interested in distributing your code as open source. Knowing how to make your own packages will give you more insight in the packaging ecosystem and will help you to work with third-party code that is available on PyPI that you are probably already using.

Also, having your closed source project or its components available as source distribution packages can help you to deploy your code in different environments. The advantages of leveraging the Python packaging ecosystem in the code deployment process will be described in more detail in the next chapter. Here we will focus on proper tools and techniques to create such distributions.

We'll discuss the confusing state of Python package tools in the next section.

In [None]:
The confusing state of Python packaging tools
The state of Python packaging was very confusing for a long time and it took many years to bring organization to this topic. Everything started with the distutils package introduced in 1998, which was later enhanced by setuptools in 2003. These two projects started a long and knotted story of forks, alternative projects, and complete rewrites that tried to (once and for all) fix the Python packaging ecosystem. Unfortunately, most of these attempts never succeeded. The effect was quite the opposite. Each new project that aimed to supersede setuptools or distutils only added to the already huge confusion around packaging tools. Some of these forks were merged back into their ancestors (such as distribute, which was a fork of setuptools), but some were left abandoned (such as distutils2).

Fortunately, this state is gradually changing. An organization called the Python Packaging Authority (PyPA) was formed to bring back the order and organization to the packaging ecosystem. The Python Packaging User Guide (https://packaging.python.org), maintained by PyPA, is the authoritative source of information about the latest packaging tools and best practices. Treat that site as the best source of information about packaging and complementary reading for this chapter. This guide also contains a detailed history of changes and new projects related to packaging. So it is worth reading it, even if you already know a bit about packaging, to make sure you still use the proper tools.

Stay away from other popular internet resources, such as The Hitchhiker's Guide to Packaging. It is old, not maintained, and mostly obsolete. It may be interesting only for historical reasons, and the Python Packaging User Guide is in fact a fork of this old resource.

Let's take a look at the effect of PyPA on Python packaging.

In [None]:
The current landscape of Python packaging thanks to PyPA
PyPA, besides providing an authoritative guide for packaging, also maintains packaging projects and a standardization process for new official aspects of Python packaging. All of PyPA's projects can be found under a single organization on GitHub: https://github.com/pypa.


Some of them were already mentioned in the book. The following are the most notable:

pip
virtualenv
twine
warehouse
Note that most of them were started outside of this organization and were moved under PyPA patronage when they became mature and widespread solutions.

Thanks to PyPA engagement, the progressive abandonment of the eggs format in favor of wheels for built distributions has already happened. Also thanks to the commitment of the PyPA community, the old PyPI implementation was finally totally rewritten in the form of the Warehouse project. Now, PyPI has a modernized user interface and many long-awaited usability improvements and features.

In the next section, we'll take a look at some of the tools recommended while working with packages.

In [None]:
Tool recommendations
The Python Packaging User Guide gives a few suggestions on recommended tools for working with packages. They can be generally divided into the following two groups:

Tools for installing packages
Tools for package creation and distribution
Utilities from the first group recommended by PyPA were already mentioned in Chapter 2, Modern Python Development Environments, but let's repeat them here for the sake of consistency:

Use pip for installing packages from PyPI.
Use virtualenv or venv for application-level isolation of the Python runtime environment.
The Python Packaging User Guide recommendations of tools for package creation and distribution are as follows:

Use setuptools to define projects and create source distributions.
Use wheels in favor of eggs to create built distributions.
Use twine to upload package distributions to PyPI.


In [None]:
Project configuration
It should be obvious that the easiest way to organize the code of big applications is to split them into several packages. This makes the code simpler, easier to understand, maintain, and change. It also maximizes the reusability of your code. Separate packages act as components that can be used in various programs.

In [None]:
setup.py
The root directory of a package that has to be distributed contains a setup.py script. It defines all metadata as described in the distutils module. Package metadata is expressed as arguments in a call to the standard setup() function. Despite distutils being the standard library module provided for the purpose of code packaging, it is actually recommended to use setuptools instead. The setuptools package provides several enhancements over the standard distutils module.

Therefore, the minimum content for this file is as follows:

from setuptools import setup 
 
setup( 
    name='mypackage', 
) 
name gives the full name of the package. From there, the script provides several commands that can be listed with the --help-commands option, as shown in the following code:

$ python3 setup.py --help-commands
Standard commands:
  build             build everything needed to install
  clean             clean up temporary files from 'build' command
  install           install everything from build directory
  sdist             create a source distribution (tarball, zip file, etc.)
  register          register the distribution with the Python package index
  bdist             create a built (binary) distribution
  check             perform some checks on the package
  upload            upload binary package to PyPI

Extra commands:
  bdist_wheel       create a wheel distribution
  alias             define a shortcut to invoke one or more commands
  develop           install package in 'development mode'

usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help
The actual list of commands is longer and can vary depending on the available setuptools extensions. It was truncated to show only those that are most important and relevant to this chapter. Standard commands are the built-in commands provided by distutils, whereas extra commands are the ones provided by third-party packages, such as setuptools or any other package that defines and registers a new command. Here, one such extra command registered by another package is bdist_wheel, provided by the wheel package.

In [None]:
setup.cfg
The setup.cfg file contains default options for commands of the setup.py script. This is very useful if the process for building and distributing the package is more complex and requires many optional arguments to be passed to the setup.py script commands. This setup.cfg file allows you to store such default parameters together with your source code on a per project basis. This will make your distribution flow independent from the project and also provides transparency about how your package was built/distributed to the users and other team members.

The syntax for the setup.cfg file is the same as provided by the built-in configparser module so it is similar to the popular Microsoft Windows INI files. Here is an example of the setup.cfg configuration file that provides some global, sdist, and bdist_wheel commands' defaults:

[global] 
quiet=1 
 
[sdist] 
formats=zip,tar 
 
[bdist_wheel] 
universal=1
This example configuration will ensure that source distributions (sdist section) will always be created in two formats (ZIP and TAR) and the built wheel distributions (bdist_wheel section) will be created as universal wheels that are independent from the Python version. Also most of the output will be suppressed on every command by the global --quiet switch. Note that this option is included here only for demonstration purposes and it may not be a reasonable choice to suppress the output for every command by default.
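Because setup.cfg uses the configparser syntax, the example file can be inspected with the standard library alone. This is only an illustration of the file format; it is not how setuptools itself loads the file.

```python
import configparser

# The same content as the example setup.cfg file above.
SETUP_CFG = """\
[global]
quiet=1

[sdist]
formats=zip,tar

[bdist_wheel]
universal=1
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

# Each section name corresponds to a setup.py command
# (or to the global options).
sections = parser.sections()
sdist_formats = parser["sdist"]["formats"].split(",")
universal = parser.getboolean("bdist_wheel", "universal")

print(sections)
print(sdist_formats, universal)
```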

In [None]:
MANIFEST.in
When building a distribution with the sdist command, the distutils module browses the package directory looking for files to include in the archive. By default distutils will include the following:

All Python source files implied by the py_modules, packages, and scripts arguments
All C source files listed in the ext_modules argument
Files that match the glob pattern test/test*.py
Files named README, README.txt, setup.py, and setup.cfg
Besides that, if your package is versioned with a version control system such as Subversion, Mercurial, or Git, there is the possibility to auto-include all version controlled files using additional setuptools extensions such as setuptools-svn, setuptools-hg, and setuptools-git. Integration with other version control systems is also possible through other custom extensions. No matter if it is the default built-in collection strategy or one defined by custom extension, the sdist will create a MANIFEST file that lists all files and will include them in the final archive.

Let's say you are not using any extra extensions, and you need to include in your package distribution some files that are not captured by default. You can define a template called MANIFEST.in in your package root directory (the same directory as setup.py file). This template directs the sdist command on which files to include.

This MANIFEST.in template defines one inclusion or exclusion rule per line:

include HISTORY.txt 
include README.txt 
include CHANGES.txt 
include CONTRIBUTORS.txt 
include LICENSE 
recursive-include *.txt *.py
The full list of the MANIFEST.in commands can be found in the official distutils documentation.

In [None]:
Most important metadata
Besides the name and the version of the package being distributed, the most important arguments that the setup() function can receive are as follows:

description: This includes a few sentences to describe the package.
long_description: This includes a full description that can be in reStructuredText (default) or other supported markup languages.
long_description_content_type: This defines the MIME type of the long description; it is used to tell the package repository what kind of markup language is used for the package description.
keywords: This is a list of keywords that define the package and allow for better indexing in the package repository.
author: This is the name of the package author or organization that takes care of it.
author_email: This is the contact email address.
url: This is the URL of the project.
license: This is the name of the license (GPL, LGPL, and so on) under which the package is distributed.
packages: This is a list of all package names in the package distribution; setuptools provides a small function called find_packages that can automatically find package names to include.
namespace_packages: This is a list of namespace packages within package distribution.

In [None]:
Trove classifiers
PyPI and distutils provide a solution for categorizing applications with the set of classifiers called trove classifiers. All trove classifiers form a tree-like structure. Each classifier string defines a list of nested namespaces where every namespace is separated by the :: substring. Their list is provided to the package definition as a classifiers argument of the setup() function.

Here is an example list of classifiers taken from the solrq project available on PyPI:

from setuptools import setup 
 
setup( 
    name="solrq", 
    # (...) 
 
    classifiers=[ 
        'Development Status :: 4 - Beta', 
        'Intended Audience :: Developers', 
        'License :: OSI Approved :: BSD License', 
        'Operating System :: OS Independent', 
        'Programming Language :: Python', 
        'Programming Language :: Python :: 2', 
        'Programming Language :: Python :: 2.6', 
        'Programming Language :: Python :: 2.7', 
        'Programming Language :: Python :: 3', 
        'Programming Language :: Python :: 3.2', 
        'Programming Language :: Python :: 3.3', 
        'Programming Language :: Python :: 3.4', 
        'Programming Language :: Python :: Implementation :: PyPy', 
        'Topic :: Internet :: WWW/HTTP :: Indexing/Search', 
    ], 
) 
Trove classifiers are completely optional in the package definition but provide a useful extension to the basic metadata available in the setup() interface. Among others, trove classifiers may provide information about supported Python versions, supported operating systems, the development stage of the project, or the license under which the code is released. Many PyPI users search and browse the available packages by categories so a proper classification helps packages to reach their target.
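The tree-like structure of classifiers can be made visible with a few lines of code. The sketch below splits some of the classifier strings from the example above on the :: separator and nests them into dictionaries; the `build_tree` helper is an illustrative assumption, not something provided by PyPI or setuptools.

```python
classifiers = [
    'Development Status :: 4 - Beta',
    'License :: OSI Approved :: BSD License',
    'Programming Language :: Python :: 3',
    'Programming Language :: Python :: 3.4',
    'Topic :: Internet :: WWW/HTTP :: Indexing/Search',
]


def build_tree(classifier_strings):
    """Nest every '::'-separated namespace into dictionaries."""
    tree = {}
    for classifier in classifier_strings:
        node = tree
        for part in classifier.split(' :: '):
            node = node.setdefault(part, {})
    return tree


tree = build_tree(classifiers)
top_level = sorted(tree)
python_versions = sorted(tree['Programming Language']['Python'])

print(top_level)
print(python_versions)
```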

Trove classifiers serve an important role in the whole packaging ecosystem and should never be ignored. There is no organization that verifies packages classification, so it is your responsibility to provide proper classifiers for your packages and not introduce chaos to the whole package index.

At the time of writing this book, there are 667 classifiers available on PyPI that are grouped into the following nine major categories:

Development status
Environment
Framework
Intended audience
License
Natural language
Operating system
Programming language
Topic
This list is ever-growing, and new classifiers are added from time to time. It is thus possible that the total count of them will be different at the time you read this. The full list of currently available trove classifiers is available at https://pypi.org/classifiers/.

In [None]:
Common patterns
Creating a package for distribution can be a tedious task for inexperienced developers. Most of the metadata that setuptools or distutils accept in their setup() function call can be provided manually, ignoring the fact that this metadata may also be available in other parts of the project. Here is an example:

from setuptools import setup 
 
setup( 
    name="myproject", 
    version="0.0.1", 
    description="mypackage project short description", 
    long_description=""" 
        Longer description of mypackage project 
        possibly with some documentation and/or 
        usage examples 
    """, 
    install_requires=[ 
        'dependency1', 
        'dependency2', 
        'etc', 
    ] 
)
Some of the metadata elements are often found in different places in a typical Python project. For instance, the content of the long description is commonly included in the project's README file, and it is a good convention to put a version specifier in the __init__ module of the package. Hardcoding such package metadata as setup() function arguments adds redundancy to the project that allows for easy mistakes and inconsistencies in the future. Neither setuptools nor distutils can automatically pick metadata information from the project sources, so you need to provide it yourself. There are some common patterns among the Python community for solving the most popular problems such as dependency management, version/readme inclusion, and so on. It is worth knowing at least a few of them because they are so popular that they could be considered packaging idioms.

In [None]:
Automated inclusion of version string from package
The PEP 440 Version Identification and Dependency Specification document specifies a standard for version and dependency specification. It is a long document that covers accepted version specification schemes and defines how version matching and comparison in Python packaging tools should work. If you are using or plan to use a complex project version numbering scheme, then you should definitely read this document carefully. If you are using a simple scheme that consists just of one, two, three, or more numbers separated by dots, then you don't have to dig into the details of PEP 440. If you don't know how to choose the proper versioning scheme, I greatly recommend following the semantic versioning scheme that was already briefly mentioned in Chapter 1, Current Status of Python.

The other problem related to code versioning is where to include that version specifier for a package or module. There is PEP 396 (Module Version Numbers) that deals exactly with this problem. PEP 396 is only an informational document and has a deferred status, so it is not a part of the official Python standards track. Still, it describes what seems to be a de facto standard now. According to PEP 396, if a package or module has a specific version defined, the version specifier should be included as a __version__ attribute of the package root's __init__.py file or the distributed module file. Another de facto standard is to also include the VERSION attribute that contains the tuple of the version specifier parts. This helps users to write compatibility code because such version tuples can be easily compared if the versioning scheme is simple enough.

So many packages available on PyPI follow both conventions. Their __init__.py files contain version attributes that look like the following:

# version as tuple for simple comparisons 
VERSION = (0, 1, 1) 
# string created from tuple to avoid inconsistency 
__version__ = ".".join([str(x) for x in VERSION]) 
The other suggestion of PEP 396 is that the version argument provided in the setup() function of the setup.py script should be derived from __version__, or the other way around. The Python Packaging User Guide features multiple patterns for single-sourcing project versioning, and each of them has its own advantages and limitations. My personal favorite is rather long and is not included in the PyPA's guide, but has the advantage of limiting the complexity only to the setup.py script. This boilerplate assumes that the version specifier is provided by the VERSION attribute of the package's __init__ module and extracts this data for inclusion in the setup() call. Here is an excerpt from some imaginary package's setup.py script that illustrates this approach:

from setuptools import setup
import ast
import os


def get_version(version_tuple):
    # additional handling of a,b,rc tags, this can
    # be simpler depending on your versioning scheme
    if not isinstance(version_tuple[-1], int):
        return '.'.join(
            map(str, version_tuple[:-1])
        ) + version_tuple[-1]
    return '.'.join(map(str, version_tuple))

# path to the packages __init__ module in project
# source tree
init = os.path.join(
    os.path.dirname(__file__), 'src', 'some_package',
    '__init__.py'
)

# read the file inside a 'with' block so it is closed
# as soon as the VERSION line has been found
with open(init) as init_file:
    version_line = [
        line for line in init_file
        if line.startswith('VERSION')
    ][0]

# VERSION is a tuple literal so we need to evaluate the
# right-hand side of 'version_line'; ast.literal_eval()
# only accepts literals, which is safer than eval().
# We could simply import it from the package but we
# cannot be sure that this package is importable before
# installation is done.
PKG_VERSION = get_version(
    ast.literal_eval(version_line.split('=')[-1].strip())
)

setup(
    name='some-package',
    version=PKG_VERSION,
    # ...
)
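To see what the get_version() helper produces, it can be tried in isolation. The function below is a copy of the one from the excerpt above, and the sample tuples are made up for the example.

```python
def get_version(version_tuple):
    # a trailing non-integer element (a, b, rc tag) is appended
    # without a dot, everything else is joined with dots
    if not isinstance(version_tuple[-1], int):
        return '.'.join(
            map(str, version_tuple[:-1])
        ) + version_tuple[-1]
    return '.'.join(map(str, version_tuple))


final_release = get_version((0, 1, 1))
release_candidate = get_version((1, 0, 'rc1'))

print(final_release)      # 0.1.1
print(release_candidate)  # 1.0rc1
```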

In [None]:
README file
The Python Package Index can display the project's README file or the value of long_description on the package page in the PyPI portal. PyPI is able to interpret the markup used in the long_description content and render it as HTML on the package page. The type of markup language is controlled through the long_description_content_type argument of the setup() call. For now, there are the following three choices for markup available:

Plain text with long_description_content_type='text/plain'
reStructuredText with long_description_content_type='text/x-rst'
Markdown with long_description_content_type='text/markdown'
Markdown and reStructuredText are the most popular choices among Python developers, but some might still want to use different markup languages for various reasons. If you want to use something different as your markup language for your project's README, you can still provide it as a project description on the PyPI page in a readable form. The trick lies in using the pypandoc package to translate your other markup language into reStructuredText (or Markdown) while uploading the package to the Python Package Index. It is important to do it with a fallback to plain content of your README file, so the installation won't fail if the user has no pypandoc installed. The following is an example of a setup.py script that is able to read the content of the README file written in AsciiDoc markup language and translate it to reStructuredText before including a long_description argument: 

import os

from setuptools import setup

try:
    # convert_file() replaces the older pypandoc.convert()
    # helper, which newer pypandoc releases no longer provide
    from pypandoc import convert_file

    def read_md(file_path):
        return convert_file(file_path, to='rst', format='asciidoc')

except ImportError:
    convert_file = None
    print(
        "warning: pypandoc module not found, "
        "could not convert Asciidoc to RST"
    )

    def read_md(file_path):
        with open(file_path, 'r') as f:
            return f.read()

README = os.path.join(os.path.dirname(__file__), 'README')

setup(
    name='some-package',
    long_description=read_md(README),
    long_description_content_type='text/x-rst',
    # ...
)

In [None]:
Managing dependencies
Many projects require some external packages to be installed in order to work properly. When the list of dependencies is very long, there comes a question as to how to manage it. The answer in most cases is very simple. Do not over-engineer it. Keep it simple and provide the list of dependencies explicitly in your setup.py script as follows:

from setuptools import setup
setup( 
    name='some-package', 
    install_requires=['falcon', 'requests', 'delorean'] 
    # ... 
) 
Some Python developers like to use requirements.txt files for tracking lists of dependencies for their packages. In some situations, you might find some reason for doing that, but in most cases, this is a relic of times when the code of that project was not properly packaged. Still, even such notable projects as Celery stick to this convention. So if you are not willing to change your habits or you are somehow forced to use requirement files, then at least do it properly. Here is one of the popular idioms for reading the list of dependencies from the requirements.txt file:

from setuptools import setup 
import os 
 
 
def strip_comments(l): 
    return l.split('#', 1)[0].strip() 
 
 
def reqs(*f): 
    return list(filter(None, [strip_comments(l) for l in open( 
        os.path.join(os.getcwd(), *f)).readlines()])) 
 
setup( 
    name='some-package', 
    install_requires=reqs('requirements.txt') 
    # ... 
)
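To check what this idiom returns, the two helpers can be exercised on a sample file. The sketch below uses a slightly simplified reqs() that takes a full path, and the requirements content is made up for the example.

```python
import os
import tempfile


def strip_comments(line):
    # drop everything after the first '#' and trim whitespace
    return line.split('#', 1)[0].strip()


def reqs(path):
    # keep only the lines that still contain something
    with open(path) as requirements_file:
        return [
            requirement
            for requirement in map(strip_comments, requirements_file)
            if requirement
        ]


SAMPLE = (
    "# web framework\n"
    "falcon>=1.4\n"
    "\n"
    "requests  # HTTP client\n"
    "delorean\n"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    requirements_path = os.path.join(tmp_dir, "requirements.txt")
    with open(requirements_path, "w") as sample_file:
        sample_file.write(SAMPLE)
    parsed = reqs(requirements_path)

print(parsed)  # ['falcon>=1.4', 'requests', 'delorean']
```

Note that this simple parsing handles only plain requirement lines and comments; pip-specific options that may appear in requirements files are not interpreted.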
In the next section, you'll learn how to add custom commands to your setup script.

In [None]:
The custom setup command
distutils allows you to create new commands. A new command can be registered with an entry point, which was introduced by setuptools as a simple way to define packages as plugins.

An entry point is a named link to a class or a function that is made available through some APIs in setuptools. Any application can scan for all registered packages and use the linked code as a plugin.

To link the new command, the entry_points metadata can be used in the setup call as follows:

setup( 
    name="my.command", 
    entry_points=""" 
        [distutils.commands] 
        my_command = my.command.module:Class 
    """ 
) 
All named links are gathered in named sections. When distutils is loaded, it scans for links that were registered under distutils.commands.

This mechanism is used by numerous Python applications that provide extensibility.
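The scanning side of this mechanism can be sketched with the standard library. Note that this uses importlib.metadata rather than the setuptools APIs mentioned above, and the `iter_group` helper is an illustrative assumption. The result depends on which packages are installed in the current environment, so the list may well be empty.

```python
from importlib.metadata import entry_points


def iter_group(group):
    """Return the entry points registered under the given group."""
    all_entry_points = entry_points()
    if hasattr(all_entry_points, "select"):
        # newer Python versions expose a select() method
        return list(all_entry_points.select(group=group))
    # older versions return a plain dict keyed by group name
    return list(all_entry_points.get(group, []))


# Names of all commands that installed packages registered
# under the distutils.commands section.
command_names = sorted(
    entry_point.name for entry_point in iter_group("distutils.commands")
)
print(command_names)
```

Calling load() on an entry point imports and returns the object the link points to, which is how an application would obtain the plugin code.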

Let's see how to work with packages during the development stage.

In [None]:
Working with packages during development
Working with setuptools is mostly about building and distributing packages. However, you still need to use setuptools to install packages directly from project sources, and the reason for that is simple: it is a good habit to test whether your packaging code works properly before submitting your package to PyPI. The simplest way to test it is by installing it. If you send a broken package to the repository, then in order to re-upload it, you need to increase the version number.

Testing if your code is packaged properly before the final distribution saves you from unnecessary version number inflation and obviously from wasting your time. Also, installation directly from your own sources using setuptools may be essential when working on multiple related packages at the same time.

In [None]:
setup.py install
The install command installs the package in your current Python environment. It will try to build the package if no previous build was made and then inject the result into the filesystem directory where Python is looking for installed packages. If you have an archive with a source distribution of some package, you can decompress it in a temporary folder and then install it with this command. The install command will also install dependencies that are defined in the install_requires argument. Dependencies will be installed from the Python Package Index.

An alternative to the bare setup.py script when installing a package is to use pip. Since it is a tool that is recommended by PyPA, you should use it even when installing a package in your local environment just for development purposes. In order to install a package from local sources, run the following command:

pip install <project-path>

In [None]:
Uninstalling packages
Amazingly, setuptools and distutils lack the uninstall command. Fortunately, it is possible to uninstall any Python package using pip as follows:

pip uninstall <package-name>
Uninstalling can be a dangerous operation when attempted on system-wide packages. This is another reason why it is so important to use virtual environments for any development.

In [None]:
setup.py develop or pip -e
Packages installed with setup.py install are copied to the site-packages directory of your current Python environment. This means that whenever you make a change to the sources of that package, you are required to reinstall it. This is often a problem during intensive development because it is very easy to forget about the need to perform installation again. This is why setuptools provides an extra develop command that allows you to install packages in the development mode. This command creates a special link to project sources in the deployment directory (site-packages) instead of copying the whole package there. Package sources can be edited without the need for reinstallation and are available in the sys.path as if they were installed normally.

pip also allows you to install packages in such a mode. This installation option is called editable mode and can be enabled with the -e parameter in the install command as follows:

pip install -e <project-path>
Once you install the package in your environment in editable mode, you can freely modify the installed package in place and all the changes will be immediately visible without the need to reinstall the package.
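If you are unsure whether a package is being imported from your project sources or from a regular copy in site-packages, you can ask the import system where it would load the module from. The following is a minimal sketch; the function name import_location is an assumption, and json is used only because it is always available.

```python
import importlib.util


def import_location(module_name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        # The module cannot be found in the current environment
        return None
    return spec.origin


# Replace "json" with the name of your own package
print(import_location("json"))
```

For a package installed in editable mode, the printed path should point into your project directory rather than into site-packages.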

Let's take a look at namespace packages in the next section.


In [None]:
Namespace packages
The Zen of Python, which you can read by typing import this in an interpreter session, says the following about namespaces:

Namespaces are one honking great idea -- let's do more of those!
And this can be understood in at least two ways. The first is a namespace in the context of the language. We all use the following namespaces without even knowing it:

The global namespace of a module
The local namespace of the function or method invocation
The class namespace
The other kind of namespaces can be provided at the packaging level. These are namespace packages. This is often an overlooked feature of Python packaging that can be very useful in structuring the package ecosystem in your organization or in a very large project.

In [None]:
Why is it useful?
Namespace packages can be understood as a way of grouping related packages, where each of these packages can be installed independently.

Namespace packages are especially useful if you have components of your application developed, packaged, and versioned independently but you still want to access them from the same namespace. This also helps to make clear to which organization or project every package belongs. For instance, for some imaginary Acme company, the common namespace could be acme. This organization could therefore create the general acme namespace package to serve as a container for its other packages. If someone from Acme wants to contribute, say, an SQL-related library to this namespace, they can create a new acme.sql package that registers itself in the acme namespace.

It is important to know what the difference between normal and namespace packages is and what problem namespace packages solve. Normally (without namespace packages), you would create a package called acme with an sql subpackage/submodule with the following file structure:

$ tree acme/
acme/
├── acme
│   ├── __init__.py
│   └── sql
│       └── __init__.py
└── setup.py

2 directories, 3 files
Whenever you want to add a new subpackage, let's say templating, you are forced to include it in the source tree of acme as follows:

$ tree acme/
acme/
├── acme
│   ├── __init__.py
│   ├── sql
│   │   └── __init__.py
│   └── templating
│       └── __init__.py
└── setup.py

3 directories, 4 files
Such an approach makes independent development of acme.sql and acme.templating almost impossible. The setup.py script will also have to specify all dependencies for every subpackage, so it is impossible (or at least very hard) to make the installation of some acme components optional. Also, with enough subpackages, it is practically impossible to avoid dependency conflicts.

With namespace packages, you can store the source tree for each of these subpackages independently as follows:

$ tree acme.sql/
acme.sql/
├── acme
│   └── sql
│       └── __init__.py
└── setup.py

2 directories, 2 files

$ tree acme.templating/
acme.templating/
├── acme
│   └── templating
│       └── __init__.py
└── setup.py

2 directories, 2 files
And you can also register them independently in PyPI or any package index you use. Users can choose which of the subpackages they want to install from the acme namespace as follows, but they never install the general acme package (it doesn't even have to exist):

$ pip install acme.sql acme.templating
Note that independent source trees are not enough to create namespace packages in Python. You need a bit of additional work if you don't want your packages to overwrite each other. Also, proper handling may differ depending on the Python language version you target. Details of that are described in the next two sections.
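As a rough preview of how this looks from the import system's point of view, the sketch below builds two independent source trees in temporary directories and imports from both through the shared acme namespace. It assumes Python 3.3 or later, where a directory without an __init__.py file can act as an implicit namespace package (PEP 420). The helper names and the NAME attribute are made up for the demonstration, and the packaging side (the setup.py of each project) is deliberately left out.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_tree(root, subpackage):
    # Build <root>/acme/<subpackage>/__init__.py, deliberately leaving
    # the shared acme/ directory without an __init__.py file
    package_dir = Path(root) / "acme" / subpackage
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(
        f"NAME = 'acme.{subpackage}'\n"
    )
    return str(root)


def load_namespace_demo():
    # Two unrelated directories, standing in for two installed projects
    sql_root = make_tree(tempfile.mkdtemp(), "sql")
    templating_root = make_tree(tempfile.mkdtemp(), "templating")
    sys.path[:0] = [sql_root, templating_root]
    importlib.invalidate_caches()

    sql = importlib.import_module("acme.sql")
    templating = importlib.import_module("acme.templating")
    # The acme namespace spans both directories
    return sql.NAME, templating.NAME, list(sys.modules["acme"].__path__)


names_and_paths = load_namespace_demo()
print(names_and_paths[0], names_and_paths[1])
print(len(names_and_paths[2]), "directories contribute to acme")
```

If either tree contained an acme/__init__.py file, acme would become a regular package tied to that single directory, and the subpackage from the other tree would no longer be importable.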

In [None]:
Multithreading
Developers often consider threading to be a very complex topic. While this statement is totally true, Python provides high-level classes and functions that ease the usage of threading. Unfortunately, the CPython implementation of threads comes with some inconvenient details that make them less useful than in other languages. They are still completely fine for some sets of problems that you may want to solve, but not for as many as in C or Java.

In this section, we will discuss the limitations of multithreading in CPython, as well as the common concurrency problems for which Python threads are still a viable solution.