# Underscores in Method and Variable Names

Ever see variable names like this and wonder what the hell is going on?
- _var
- var_
- __var
- \__var__
- _

## Single Leading Underscore
A single leading underscore signals that a variable or method is intended for internal use. This is a convention only and is not enforced by the interpreter.


In [4]:
class Test:
    
    def __init__(self):
        self.foo = 42
        self._bar = 100

In [6]:
t = Test()
print(t.foo)
print(t._bar)

42
100


The single leading underscore is more of a hint to other programmers that the variable should only be used internally within the class.

On the other hand, if you give a module-level function or variable a leading underscore, this affects which names are imported by a wildcard import (`from some_module import *`):

In [7]:
# some_module.py

def foo():
    return 42

def _bar():
    return 100

If I now run `from some_module import *`, only the `foo` function will be imported. A wildcard import skips names starting with a leading underscore, although a plain `import some_module` still lets you call `some_module._bar()`. This can be overridden by defining the `__all__` list in the module and is explained further in this [notebook](https://github.com/harpalsahota/DataScience/blob/master/Python/Misc/Stable_APIs.ipynb)

In [None]:
from some_module import *
foo() # Will work
_bar() # Will not work, raises NameError
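
The underscore filtering applies to wildcard imports (`from module import *`), and it can be checked without creating a file on disk. The sketch below is only an illustration: it builds a throwaway module in memory (the module name `demo_module` and the helper `wildcard_names` are made up for this example) and lists the names a wildcard import brings in, with and without `__all__`.

```python
import sys
import types

SOURCE = """
def foo():
    return 42

def _bar():
    return 100
"""


def wildcard_names(source, module_name="demo_module"):
    """Return the names that `from <module_name> import *` would bind."""
    module = types.ModuleType(module_name)
    exec(source, module.__dict__)        # run the source inside the new module
    sys.modules[module_name] = module    # make it importable by name
    try:
        namespace = {}
        exec(f"from {module_name} import *", namespace)
    finally:
        del sys.modules[module_name]     # clean up the throwaway module
    namespace.pop("__builtins__", None)  # added by exec, not by the import
    return sorted(namespace)


default_names = wildcard_names(SOURCE)
explicit_names = wildcard_names(SOURCE + "\n__all__ = ['foo', '_bar']\n")

print(default_names)   # ['foo']
print(explicit_names)  # ['_bar', 'foo']
```

Without `__all__`, only `foo` is picked up. Once `__all__` lists `_bar` explicitly, the wildcard import includes it despite the leading underscore.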

## Single Trailing Underscore

Sometimes the most fitting name for a variable is already taken by a Python keyword or built-in, such as `class` or `list`. This can be addressed by appending a single trailing underscore to the variable name.



In [8]:
def return_list(list_):
    pass
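
As a quick illustration, the hypothetical helper below (`make_tag` is not part of the original notebook) uses `class_` because `class` is a reserved keyword. The standard-library `keyword` module can confirm which names are true keywords. `list` is a built-in rather than a keyword, so shadowing it is legal but still best avoided.

```python
import keyword


def make_tag(name, class_=None):
    """Build a minimal HTML-like opening tag (illustrative only)."""
    if class_ is None:
        return f"<{name}>"
    return f'<{name} class="{class_}">'


print(keyword.iskeyword("class"))     # True: cannot be used as a name
print(keyword.iskeyword("list"))      # False: a built-in, not a keyword
print(make_tag("div", class_="box"))  # <div class="box">
```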

## Double Leading Underscore

A leading double underscore is used to avoid naming conflicts in subclasses. Python does this by rewriting the attribute name to include the class name. This is known as `name mangling`.

In [9]:
class Test:
    
    def __init__(self):
        self.foo = 42
        self._bar = 100
        self.__fizz = 200

In [10]:
t = Test()
dir(t)

['_Test__fizz',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_bar',
 'foo']

When I look at the attributes of the object I can see that `foo` and `_bar` are as expected. What’s happened to `__fizz`? It’s become name mangled and is now `_Test__fizz`. This is done to prevent the variable from being overridden by subclasses.

Let's make another class which subclasses the `Test` class:

In [12]:
class NewTest(Test):
    
    def __init__(self):
        super().__init__()
        self.foo = 'Overridden by NewTest'
        self._bar = 'Overridden by NewTest'
        self.__fizz = 'Overridden by NewTest'

In [13]:
t2 = NewTest()
print(t2.foo)
print(t2._bar)
print(t2.__fizz)

Overridden by NewTest
Overridden by NewTest


AttributeError: 'NewTest' object has no attribute '__fizz'

We got an `AttributeError` because the `__fizz` assigned in `NewTest` was name mangled, so no attribute literally named `__fizz` exists on the object:

In [14]:
print(t2._NewTest__fizz)

Overridden by NewTest


We still have the `__fizz` in `Test` too:

In [15]:
t2._Test__fizz

200

Nice! As an FYI, name mangling applies to methods too!
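
A minimal sketch of mangling on methods, using two hypothetical classes (`Base` and `Child` are invented for this example). Because each `__helper` is stored under its own class's mangled name, the subclass cannot accidentally replace the one the parent relies on.

```python
class Base:
    def __helper(self):
        return "Base helper"

    def run(self):
        # Inside Base, this call is compiled as self._Base__helper()
        return self.__helper()


class Child(Base):
    def __helper(self):
        # Stored as _Child__helper, so it does not replace Base's method
        return "Child helper"


c = Child()
print(c.run())             # Base helper
print(c._Child__helper())  # Child helper
```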

## Double Leading and Trailing Underscores

Names with a double leading and trailing underscore are not mangled by Python. However, these names are reserved for special use by the language, e.g. `__init__`, so it is best not to invent your own.

In [16]:
class Dunder:
    
    def __init__(self):
        self.__foo__ = 42

In [17]:
d = Dunder()
d.__foo__

42
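
To show what "special use" means in practice, here is a small sketch with a hypothetical `Playlist` class (not from the original notebook). Python looks up `__len__` and `__contains__` itself when you call `len()` or use the `in` operator, so you rarely call these methods directly.

```python
class Playlist:
    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # Called by the built-in len()
        return len(self._songs)

    def __contains__(self, song):
        # Called by the `in` operator
        return song in self._songs


p = Playlist(["a", "b", "c"])
print(len(p))    # 3
print("b" in p)  # True
```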

## Single Underscore

By convention, a single underscore tells the user the variable is not used or is insignificant:

In [19]:
for _ in range(10):
    print('Printing 10 times')

Printing 10 times
Printing 10 times
Printing 10 times
Printing 10 times
Printing 10 times
Printing 10 times
Printing 10 times
Printing 10 times
Printing 10 times
Printing 10 times


Another example is unpacking a tuple: if you only require one item, the other items can be named with a single underscore.

In [21]:
_, _, colour = ('Fiat', 10000, 'Red')

In [22]:
colour

'Red'

It should be noted that in an interactive session (the Python REPL or a notebook) `_` is a temporary variable that holds the result of the last expression evaluated by the interpreter (I had to restart the interpreter because `_` was pointing to 10000 from the above tuple):

In [1]:
42 + 100

142

In [2]:
_

142

In [3]:
_ + 100

242

In [4]:
dict()

{}

In [6]:
_

{}

In [7]:
_['foo'] = 42

In [8]:
_

{'foo': 42}
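
In the standard Python REPL, the mechanism behind this is `sys.displayhook`: the default hook prints the value of each expression and stores it in `builtins._`. The sketch below calls the default hook by hand from a script to imitate one REPL step. IPython and Jupyter use their own implementation, so treat this as an illustration of the plain interpreter only.

```python
import builtins
import sys

# In a script nothing assigns `_` automatically. The interactive interpreter
# passes each expression result to sys.displayhook; the default hook prints
# the value and saves it in builtins._
sys.__displayhook__(42 + 100)  # prints 142

last_result = builtins._
print(last_result + 100)       # 242
```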