In [23]:
import time

import numpy as np

# Python & asterisks

Note: A Jupyter notebook displays the value of the last expression in a cell if it isn't assigned to a variable. This is why I can avoid using `print` in most cases.

Note: This tutorial applies to Python 3.5+; there may be some slight differences in behaviour versus Python 2.7.

This is a tutorial with some examples of how the asterisk (`*`) is used in Python.

### Multiply

The first major use of the asterisk is, of course, for multiplication:

In [2]:
(1.2 + 3.1j) * 2  # complex number multiplied by an integer

(2.4+6.2j)

We can also "multiply" strings and other sequences, such as lists, by an integer:

In [3]:
a = [1, "hi" * 3, np.zeros(5)] * 2

a  

[1,
 'hihihi',
 array([0., 0., 0., 0., 0.]),
 1,
 'hihihi',
 array([0., 0., 0., 0., 0.])]

The call to `np.zeros` is made before the list is doubled, and both the above arrays refer to the same memory locations:

In [4]:
a[2][0] = np.pi

print(a[2][0] == a[5][0])

a

True


[1,
 'hihihi',
 array([3.14159265, 0.        , 0.        , 0.        , 0.        ]),
 1,
 'hihihi',
 array([3.14159265, 0.        , 0.        , 0.        , 0.        ])]
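
If independent copies are wanted instead, one common approach is to build the list with a comprehension, so that the element is created afresh on each iteration. The sketch below uses plain nested lists rather than NumPy arrays to stay dependency-free; the names `shared` and `independent` are mine, and the same aliasing behaviour applies to arrays.

```python
# Multiplying a list repeats references, so both slots hold the same inner list.
shared = [[0, 0, 0]] * 2
shared[0][0] = 1
shared_is_same = shared[0] is shared[1]

# A comprehension evaluates its expression once per element,
# so each slot gets its own inner list.
independent = [[0, 0, 0] for _ in range(2)]
independent[0][0] = 1
independent_is_same = independent[0] is independent[1]

print(shared_is_same, shared[1][0])            # True 1
print(independent_is_same, independent[1][0])  # False 0
```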

It's worth mentioning that it's possible to define how the multiplication operator behaves for a class by defining the methods `__mul__` and `__rmul__`, though I've never used this myself. See [this StackOverflow answer](https://stackoverflow.com/questions/5181320/under-what-circumstances-are-rmul-called/5182501#5182501) for an explanation of the difference between these.

For example, let's make a class that uses `*` as the matrix multiplication operator. (Normally, NumPy arrays use `*` for elementwise multiplication and `@` for matrix multiplication.)

In [27]:
class MulMatrix:
    def __init__(self, A):
        self.A = np.array(A)
        
    def __mul__(self, other):
        return self.A @ other.A
    
    # we don't need to define __rmul__ as long as both arguments to the operator are MulMatrix instances,
    # as Python first tries to apply the leftmost argument's __mul__ method, and this always succeeds
    # when the rightmost argument is also a MulMatrix

# tril and triu are NumPy functions that return the lower and upper triangular parts of a matrix
lower = np.tril(np.ones((5, 5)))  
upper = np.triu(np.ones((5, 5)))

lower_mat = MulMatrix(lower)
upper_mat = MulMatrix(upper)

print(lower * upper)
print(lower_mat * upper_mat)
print(upper_mat * lower_mat)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
[[1. 1. 1. 1. 1.]
 [1. 2. 2. 2. 2.]
 [1. 2. 3. 3. 3.]
 [1. 2. 3. 4. 4.]
 [1. 2. 3. 4. 5.]]
[[5. 4. 3. 2. 1.]
 [4. 4. 3. 2. 1.]
 [3. 3. 3. 2. 1.]
 [2. 2. 2. 2. 1.]
 [1. 1. 1. 1. 1.]]
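
The example above never needs `__rmul__`. As a rough illustration of when it does come into play, here is a sketch with a hypothetical `Vec2` class (not part of the tutorial's code) that supports multiplication by a plain number on either side. In `3 * v`, Python first tries `int.__mul__`, which returns `NotImplemented` for an unknown type, and then falls back to `v.__rmul__(3)`.

```python
class Vec2:
    """Hypothetical 2D vector, used only to illustrate __mul__ and __rmul__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __mul__(self, scalar):
        # Handles `vec * scalar`.
        if not isinstance(scalar, (int, float)):
            return NotImplemented
        return Vec2(self.x * scalar, self.y * scalar)

    def __rmul__(self, scalar):
        # Handles `scalar * vec`, tried after the number's own __mul__ gives up.
        return self.__mul__(scalar)

    def __repr__(self):
        return "Vec2({}, {})".format(self.x, self.y)


v = Vec2(1, 2)
print(v * 3)  # Vec2(3, 6)
print(3 * v)  # Vec2(3, 6)
```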


### Splat

The second major category of use is as the "splat" operator, which packs or unpacks function arguments. For example,

In [37]:
# whatever arguments are passed to `f` are packed in `args` (a tuple, i.e. immutable list)
def f(*args):
    print(args)
    return sum(args)

f(1, 2, 3, 4, 5)

(1, 2, 3, 4, 5)
15


A function's _parameters_ are the placeholders in its definition, while _arguments_ are the objects actually passed to it. In this case, the parameter is `args`, plus the splat operator. The arguments in the function call are `1, 2, 3, 4, 5`.

If the asterisk is put before an argument instead of a parameter, it unpacks the argument. For example, by placing the arguments in a list, the above function call is equivalent to:

In [38]:
b = [1, 2, 3, 4, 5]

f(*b)

(1, 2, 3, 4, 5)
15
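
Unpacking is not limited to lists. As a small sketch (redefining `f` without the `print` so the block stands alone), any iterable can be unpacked, and from Python 3.5 onwards several unpackings can be mixed with ordinary positional arguments in one call. The variable names here are only illustrative.

```python
def f(*args):
    return sum(args)


# Any iterable can be unpacked, not just a list.
total_from_range = f(*range(1, 6))

# Positional arguments and several unpackings can be combined (Python 3.5+).
total_mixed = f(10, *[1, 2], *(3, 4))

print(total_from_range, total_mixed)  # 15 20
```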


The single splat is useful when an arbitrary number of arguments are all treated similarly by the function, except perhaps for their ordering. For example, it can be used in a function that concatenates strings:

In [31]:
def concatenate(*args):
    # I'd normally use ''.join directly, though
    return ''.join(args)

concatenate("Hello", " ", "world")

'Hello world'

For the same reason, it may be useful for numerical functions when it is desirable to pass a bunch of values as arguments:

In [60]:
def mean(*args):
    return sum(args) / len(args)

mean(1, 2, 3, 4, 5, 6)

3.5

I don't use the single splat much in practice, as I will typically pass a list (or other iterable) rather than passing multiple arguments:

In [33]:
def mean(args):
    return sum(args) / len(args)

values = [1, 2, 3, 4, 5, 6]
mean(values)

3.5

The double splat operator `**` is used to pack and unpack dictionaries. When used on a parameter, this means packing *named* arguments:

In [47]:
def g(**kwargs):
    print(kwargs)
    return {''.join(kwargs.keys()): sum(kwargs.values())}

g(foo=1, bar=-1)

{'foo': 1, 'bar': -1}


{'foobar': 0}

Likewise with arguments:

In [49]:
c = {'foo': 100, 'bar': -100}

g(**c)

{'foo': 100, 'bar': -100}


{'foobar': 0}

The double splat can be useful to avoid repetition in function definitions with default arguments:

In [18]:
def name_string(surname="Smith", given_name="Omar"):
    # format as "given_name surname", using the defaults for anything not supplied
    return "{} {}".format(given_name, surname)

name_string()

'Omar Smith'

In [19]:
def name_nation_string(nation="Persia", **kwargs):
    # forward every other keyword argument to name_string
    return "{}, {}".format(name_string(**kwargs), nation)

name_nation_string(surname="Khayyam")

'Omar Khayyam, Persia'

Here, when arguments with the keywords `surname` or `given_name` are passed to `name_nation_string`, they will be passed to `name_string`, and override its default arguments.

Other keyword arguments (e.g. `birthdate="1048/05/18"`) passed to `name_nation_string` will also be passed along to `name_string`, but in this case will raise a `TypeError`; `name_string` takes arguments only for the two keywords mentioned earlier. But if we add `**kwargs` to the end of `name_string`'s parameter list as well, it will accept other keyword arguments passed from `name_nation_string` (or any other function). If `kwargs` isn't used in the function, either directly or by passing it on to another, then it is simply ignored (i.e. it can capture superfluous arguments without raising an error, though this should be avoided unless there is an explicit reason for doing so).
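
To make that concrete, here is a sketch that shows both behaviours side by side. The `name_string_lenient` variant and the extra `name_func` parameter are hypothetical additions of mine, included only so that the strict and lenient versions can be compared in one cell.

```python
def name_string(surname="Smith", given_name="Omar"):
    return "{} {}".format(given_name, surname)


def name_string_lenient(surname="Smith", given_name="Omar", **kwargs):
    # Hypothetical variant: unrecognised keywords land in `kwargs` and are ignored.
    return "{} {}".format(given_name, surname)


def name_nation_string(nation="Persia", name_func=name_string, **kwargs):
    # `name_func` is an assumed extra parameter so either variant can be plugged in.
    return "{}, {}".format(name_func(**kwargs), nation)


# The strict version rejects the unexpected `birthdate` keyword.
try:
    name_nation_string(surname="Khayyam", birthdate="1048/05/18")
except TypeError as err:
    print("TypeError:", err)

# The lenient version swallows it silently.
lenient_result = name_nation_string(
    surname="Khayyam", birthdate="1048/05/18", name_func=name_string_lenient
)
print(lenient_result)  # Omar Khayyam, Persia
```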

This example is very simple (we hardly need both of these functions) but `**kwargs` can be very useful in more structured settings; for example, it can be used in a class's alternative constructor method(s) to avoid repeating the entire list of parameters and default arguments from the class's `__init__` method:

In [None]:
class PolymathHandler:
    def __init__(self, given_name=None, surname=None, nation="Persia", mathematician=True, poet=True):
        self.given_name = given_name
        self.surname = surname
        self.nation = nation
        # etc...
        
    @classmethod
    def from_name_string(cls, name_string, **kwargs):
        """Given a full name string, split into given and surnames and continue with instantiation.
        
        Note the use of `**kwargs` rather than repeating `nation="Persia", mathematician=True, poet=True`."""
        given_name, surname = name_string.split(' ')
        # since this is a class method of PolymathHandler, cls is PolymathHandler (or a subclass) when it is called, so this returns an instance
        return cls(given_name=given_name, surname=surname, **kwargs)
        

Note the use of `@classmethod`, which is a built-in "decorator" that alters a method to take its class (`cls` by convention) rather than the instance (`self`) as the first argument. This allows class methods to behave as alternative constructors that return instances of the class. This is used similarly to constructor overloading (e.g. in Java) but requires that explicitly different names are used for different constructors.
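
As a usage sketch, the class is repeated here in shortened form so the cell runs on its own; the second name and its keyword values are just illustrative inputs. Keyword arguments given to `from_name_string` travel through `**kwargs` into `__init__` and override its defaults.

```python
class PolymathHandler:
    def __init__(self, given_name=None, surname=None, nation="Persia",
                 mathematician=True, poet=True):
        self.given_name = given_name
        self.surname = surname
        self.nation = nation
        # etc...

    @classmethod
    def from_name_string(cls, name_string, **kwargs):
        given_name, surname = name_string.split(' ')
        return cls(given_name=given_name, surname=surname, **kwargs)


# Called on the class itself, not on an instance.
omar = PolymathHandler.from_name_string("Omar Khayyam")
print(omar.given_name, omar.surname, omar.nation)  # Omar Khayyam Persia

# Extra keywords are forwarded to __init__ and replace its defaults.
ada = PolymathHandler.from_name_string("Ada Lovelace", nation="England", poet=False)
print(ada.given_name, ada.nation)  # Ada England
```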

I mentioned "decorators". What are those, in general? Whenever a method (or function) definition is preceded by `@some_function`, this is equivalent to defining the function as usual, passing it to the decorator function, and rebinding the function's name to whatever the decorator returns:

In [None]:
@some_function
def do_things(*args):
    return args

# the above is equivalent to:
def do_things(*args):
    return args

do_things = some_function(do_things)

# running this cell raises a NameError since some_function is undefined; this is only a syntax example

In other words, a decorator is a function wrapper; it takes a function and returns a function with modified behaviour. This can be used, for example, to time functions (by nesting the call between calls to `time.time`) without needing to repeat the timing code in each function to be timed. The timing decorator needs to be defined only once and can then be applied to whichever functions need to be timed.

To illustrate this, here's an example of a timing decorator:

In [26]:
def print_exec_time(func):
    """Time the execution of `func` and print."""
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = func(*args, **kwargs)
        exec_time = time.time() - start_time
        print(exec_time)
        return result
    
    return wrapper


def wait(dur=1):
    time.sleep(dur)
    
# instead of using the `@` notation above, apply the decorator explicitly:  
wait = print_exec_time(wait)  # returns "wrapper" which replaces the original "wait"
    
wait(1)

1.0010981559753418


If you've never seen decorators, this nested definition might be confusing. When the decorator `@print_exec_time` is applied to a function or method, it "replaces" the definition of that function with `wrapper`, which calls (and returns the same value as) the original function along with any number of additional operations (in this case, calling `time.time` before and after the function call). This decorator can be applied to functions with any number of positional or named arguments; the `wrapper` simply captures anything that would have been sent to the undecorated function, using `*args` and `**kwargs`, and passes it along.
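
For comparison, here is a sketch of the same decorator applied with the `@` notation. It makes two changes that are my own suggestions rather than part of the cell above: `functools.wraps` copies the original function's name and docstring onto `wrapper`, and `time.perf_counter` replaces `time.time` because it is intended for measuring short intervals. The `add` function is a made-up example.

```python
import functools
import time


def print_exec_time(func):
    """Time the execution of `func` and print the elapsed seconds."""
    @functools.wraps(func)  # keep func's __name__ and __doc__ on the wrapper
    def wrapper(*args, **kwargs):
        start_time = time.perf_counter()
        result = func(*args, **kwargs)
        print(time.perf_counter() - start_time)
        return result

    return wrapper


@print_exec_time
def add(a, b=0):
    """Return a + b."""
    return a + b


# Positional and keyword arguments both pass through the wrapper unchanged.
total = add(1, b=2)
print(total, add.__name__)
```

Without `functools.wraps`, `add.__name__` would report `wrapper`, which tends to make tracebacks and documentation harder to read.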

Finally, note that the names `args` and `kwargs` are only conventions, and other names can be used in the same way, though this is generally bad for readability. I suppose if the extra arguments will (for example) always be passed to a specific function, more descriptive names could be used, though I have never done this.
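
As a minimal sketch of what that might look like, the hypothetical function below uses `numbers` and `options` in place of `args` and `kwargs`; only the asterisks matter to Python.

```python
def scaled_sum(*numbers, **options):
    # `numbers` is a tuple of positional arguments; `options` is a dict of keyword arguments.
    scale = options.get("scale", 1)
    return scale * sum(numbers)


print(scaled_sum(1, 2, 3))           # 6
print(scaled_sum(1, 2, 3, scale=2))  # 12
```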