

## Built In Functions

`len` returns the number of items in a container object, e.g. a list or dictionary.

In [5]:
normal_list = [ 1, 2, 3, 4, 5 ]
len(normal_list)

5

In [6]:
# loop over items from back to front
for nl in reversed(normal_list):
    print( nl, end = " " )

5 4 3 2 1 

In [7]:
# access an instance's class name
normal_list.__class__.__name__

'list'

`enumerate` yields tuples in which the first element is the index and the second is the original item. For example, the code below prints every line of a file with its line number. The index is zero-based by default, hence the `index + 1`.

In [8]:
filename = 'contact.txt'
with open(filename) as file:
    for index, line in enumerate(file):
        print( '{0}: {1}'.format( index + 1, line ), end = '' )

1: first	last	email
2: john	smith	jsmith@example.com
3: jane	doan	janed@example.com
4: david	neilson	dn@example.com

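`enumerate` also accepts a `start` argument, which removes the need for the manual `index + 1`. A minimal sketch, using a hypothetical in-memory list (`lines`) in place of the file:

```python
# hypothetical in-memory stand-in for the lines of contact.txt
lines = ['first\tlast\temail', 'john\tsmith\tjsmith@example.com']

# start = 1 makes the counter begin at 1 instead of 0
numbered = list(enumerate(lines, start = 1))
for index, line in numbered:
    print('{0}: {1}'.format(index, line))
```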

`zip` takes two or more sequences and produces a new sequence of tuples, where each tuple contains one element from each of the original sequences.

In [9]:
# .strip removes leading and trailing whitespace
contacts = []
with open(filename) as file:
    header = file.readline().strip().split('\t')
    for line in file:
        line = line.strip().split('\t')
        # zip pairs each header field with the matching value;
        # dict uses the first element of each pair as the key
        # and the second as the value
        contacts.append( dict( zip( header, line ) ) )

for contact in contacts:
    print("email: {email} -- {last}, {first}".format(**contact))

email: jsmith@example.com -- smith, john
email: janed@example.com -- doan, jane
email: dn@example.com -- neilson, david


We can "unzip" a zipped list of tuples by zipping it again.

In [10]:
list_one = [ 'a', 'b', 'c' ]
list_two = [ 1, 2, 3 ]
zipped = zip(list_one, list_two)
zipped = list(zipped)
print(zipped)

# unpack the parameters to pass these individual sequences as
# arguments to the zip function
unzipped = zip(*zipped)
list(unzipped)

[('a', 1), ('b', 2), ('c', 3)]


[('a', 'b', 'c'), (1, 2, 3)]

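One caveat worth knowing: `zip` stops at the shortest input, so extra items are silently dropped. A small sketch with made-up lists (`letters`, `numbers`); `itertools.zip_longest` from the standard library pads the shorter input instead:

```python
from itertools import zip_longest

letters = [ 'a', 'b', 'c' ]
numbers = [ 1, 2 ]

# zip stops as soon as the shortest input is exhausted
truncated = list(zip(letters, numbers))
print(truncated)

# zip_longest pads missing values with fillvalue (None by default)
padded = list(zip_longest(letters, numbers, fillvalue = 0))
print(padded)
```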
`min` and `max`, like `sort`, accept a `key` argument that defines how items are compared.

In [11]:
def min_max_indexes(seq):
    # compare by value (s[1]) and return the (index, value) pair
    minimum = min( enumerate(seq), key = lambda s: s[1] )
    maximum = max( enumerate(seq), key = lambda s: s[1] )
    return minimum, maximum

In [12]:
alist = [5,0,1,4,6,3]
min_max_indexes(alist)

((1, 0), (4, 6))

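The `key` argument works with any function of one argument, not just a lambda. A short sketch with made-up words, picking the shortest and longest string by passing `len` as the key:

```python
words = [ 'pear', 'fig', 'banana', 'kiwi' ]

# compare items by their length rather than alphabetically
shortest = min(words, key = len)
longest = max(words, key = len)
print(shortest, longest)
```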
## Comprehensions

Comprehensions are a concise, optimized syntax for creating a list, set or dictionary from an existing sequence.

List comprehension, e.g. converting strings to integers.

In [13]:
input_strings = ['1', '5', '28', '131', '3']
output_integers = [int(num) for num in input_strings]
output_integers

[1, 5, 28, 131, 3]

In [14]:
# exclude strings with more than two characters
[int(n) for n in input_strings if len(n) < 3]

[1, 5, 28, 3]

Set comprehension, e.g. the example below uses a namedtuple to model author/title/genre triads, then retrieves the set of all authors who write in a specific genre.

In [15]:
from collections import namedtuple
Book = namedtuple( "Book", "author title genre" )

books = [
    Book( "Pratchett", "Nightwatch", "fantasy" ),
    Book( "Pratchett", "Thief Of Time", "fantasy" ),
    Book( "Le Guin", "The Dispossessed", "scifi" ),
    Book( "Le Guin", "A Wizard Of Earthsea", "fantasy" ),
    Book( "Turner", "The Thief", "fantasy" ),
    Book( "Phillips", "Preston Diamond", "western" ),
    Book( "Phillips", "Twice Upon A Time", "scifi" ),
]
fantasy_author = { b.author for b in books if b.genre == 'fantasy' }
fantasy_author

{'Le Guin', 'Pratchett', 'Turner'}

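The same syntax with `key: value` pairs builds a dictionary. A minimal sketch, using a trimmed copy of the `Book` tuples, that maps each fantasy title to its author (the name `fantasy_titles` is made up here):

```python
from collections import namedtuple

Book = namedtuple( "Book", "author title genre" )
books = [
    Book( "Pratchett", "Nightwatch", "fantasy" ),
    Book( "Le Guin", "The Dispossessed", "scifi" ),
    Book( "Turner", "The Thief", "fantasy" ),
]

# dictionary comprehension: { key: value for item in sequence if condition }
fantasy_titles = { b.title: b.author for b in books if b.genre == 'fantasy' }
print(fantasy_titles)
```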
## Generators

Generators are useful when we're just looping over items and don't need a final container holding all of them. To create a generator expression, wrap the comprehension in `()` instead of `[]`. This is handy, for example, when looping over large log files.

In [16]:
with open('log_file.log') as file:
    # test for 'WARNING' anywhere in the line; no need to split the line and compare fields
    warnings = ( l for l in file if 'WARNING' in l )
    for l in warnings:
        print(l)







In [17]:
# read the input log file and write a new log file that contains only the WARNING lines
"""
with open(inname) as infile:
    with open(outname, "w") as outfile:
        warnings = (l for l in infile if 'WARNING' in l)
        for l in warnings:
            outfile.write(l)
"""



Building on the example above, suppose we want to drop the WARNING column from the output (it is redundant, since every remaining line contains WARNING).

In [18]:
with open('log_file.log') as file:
    # remove the tab and WARNING
    warnings = ( l.replace( '\tWARNING', '' ) for l in file if 'WARNING' in l )
    for l in warnings:
        print(l)







The same filter can be written as a generator function using the `yield` expression.

In [19]:
def warnings_filter(sequence):
    for l in sequence:
        if 'WARNING' in l:
            yield l.replace( '\tWARNING', '' )

with open('log_file.log') as file:
    warnings = warnings_filter(file)
    for warning in warnings:
        print(warning)

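A generator produces its items lazily: nothing in the function body runs until something asks for the next value. A self-contained sketch of the same filter, using a hypothetical in-memory list of log lines (`sample_log`) in place of `log_file.log`:

```python
def warnings_filter(sequence):
    for l in sequence:
        if 'WARNING' in l:
            yield l.replace( '\tWARNING', '' )

# hypothetical log lines, tab-separated: date, level, message
sample_log = [
    'Apr 05\tDEBUG\tThis is a debugging message.\n',
    'Apr 05\tWARNING\tThis is a warning. It could be serious.\n',
    'Apr 05\tINFO\tThis is an informative message.\n',
    'Apr 05\tWARNING\tAnother warning sent.\n',
]

warnings = warnings_filter(sample_log)
# next() pulls one item at a time; the rest are not computed yet
first = next(warnings)
# list() consumes whatever the generator has left
remaining = list(warnings)
print(first, end = '')
print(remaining)
```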






## Method Overloading

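Method overloading means having multiple methods with the same name that accept different sets of arguments. Python has no separate overloads; default, variable and keyword arguments serve the same purpose. **Note** that a default argument value is evaluated once, when the function is defined, not each time it is called.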

In [20]:
# be careful with mutable defaults such as empty containers,
# the list is created only once, when the function is defined
def hello( b = [] ):
    b.append('a')
    print(b)
hello()
hello()

['a']
['a', 'a']


In [21]:
# write this instead
def hello( b = None ):
    if b is None:
        b = []
    b.append('a')
    print(b)
hello()
hello()

['a']
['a']

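The shared default can be inspected directly: a function stores its default values in its `__defaults__` attribute. A small sketch (the function name `append_a` is made up for illustration):

```python
def append_a( b = [] ):
    b.append('a')
    return b

# the default list is created once, when the def statement runs
print(append_a.__defaults__)
append_a()
append_a()
# the same list object has been mutated by both calls
print(append_a.__defaults__)
```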

In Python, a function can also accept an arbitrary number of arguments. The `*` means that the function accepts any number of positional arguments, e.g. `*links` says "I'll accept any number of arguments and put them all in a tuple named `links`".

In [22]:
def get_pages(*links):
    for link in links:
        # download the link with urllib
        print(link)
        
# all the function calls below are valid
get_pages()
get_pages('http://www.archlinux.org')
get_pages('http://www.archlinux.org', 'http://ccphillips.net/')

http://www.archlinux.org
http://www.archlinux.org
http://ccphillips.net/

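To confirm what arrives inside the function, a small sketch (the function name `count_links` is hypothetical) that reports the type and length of the collected arguments:

```python
def count_links(*links):
    # links is always a tuple, even when no arguments are passed
    return type(links).__name__, len(links)

print(count_links())
print(count_links('http://www.archlinux.org', 'http://ccphillips.net/'))
```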

The `**` represents arbitrary keyword arguments, which arrive in the function as a dictionary. A dictionary's `.update` method adds another dictionary's key-value pairs to it, overwriting the values of keys that already exist.

In [23]:
class Options(object):
    
    default_options = {
        'port': 21,
        'host': 'localhost',
        'username': None,
        'password': None,
        'debug': False,
    }
    
    def __init__(self, **kwargs):
        # first make a copy of the class level dictionary
        # then we can use the update method to change the default values
        self.options = dict(Options.default_options)
        self.options.update(kwargs)
    
    def __getitem__(self, key):
        # allows us to use the new class using dictionary indexing syntax.
        # get the dictionary to return the value for you
        return self.options[key]

In [24]:
options = Options( username = "dusty", password = "drowssap", debug = True )
print( options['debug'] )
print( options['port'] )
print( options['username'] )

True
21
dusty


The approach above can be dangerous because it is not explicit: it's possible to pass arbitrary keyword arguments to the dictionary (or worse, misspell a key) without any error. We could add some code to enforce the set of valid keys and document it in the class's docstring.

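One way to enforce this, sketched under the assumption that only the keys in `default_options` are legal (the class name `StrictOptions` is made up here), is to reject unknown keywords in `__init__`:

```python
class StrictOptions(object):
    """Connection options; only the keys in default_options are accepted."""

    default_options = {
        'port': 21,
        'host': 'localhost',
        'username': None,
        'password': None,
        'debug': False,
    }

    def __init__(self, **kwargs):
        # the set difference finds keywords that are not known options
        unknown = set(kwargs) - set(self.default_options)
        if unknown:
            raise TypeError('unknown option(s): {0}'.format(sorted(unknown)))
        self.options = dict(self.default_options)
        self.options.update(kwargs)

    def __getitem__(self, key):
        return self.options[key]

options = StrictOptions( username = "dusty", debug = True )
print( options['debug'] )

try:
    StrictOptions( usernme = "dusty" )  # misspelled on purpose
except TypeError as error:
    print(error)
```

Raising `TypeError` mirrors what Python itself does when a function receives an unexpected keyword argument, but any exception type would work.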
In [25]:
import os
import shutil

def augmented_move( target_folder, *filenames, verbose = False, **specific ):
    """Move all filenames into the target folder, allowing specific treatment of certain files"""
    
    def print_verbose( message, filename ):
        if verbose:
            print( message.format(filename) )
    
    for filename in filenames:
        target_path = os.path.join( target_folder, filename )
        if filename in specific:
            if specific[filename] == 'ignore':
                print_verbose( "Ignoring {0} ", filename )
            elif specific[filename] == 'copy':
                print_verbose( "Copying {0} ", filename )
                # shutil.copyfile( filename, target_path )
        else:
            print_verbose("Moving {0} ", filename)
            # shutil.move(filename, target_path)

In [26]:
# we have to pass verbose as a keyword argument, or else
# the function will treat it as another filename in *filenames
augmented_move( "move_here", "four", "five", "six", verbose = True, four = "copy", five = "ignore" )

Copying four 
Ignoring five 
Moving six 


## Unpacking Arguments

In [27]:
def show_args( arg1, arg2, arg3 = "THREE" ):
    print( arg1, arg2, arg3 )

some_args = range(3)
more_args = {
    "arg1": "ONE",
    "arg2": "TWO"
}

# when we have a sequence of arguments, we can use the * operator
# to unpack it into positional arguments
print("Unpacking a sequence:", end=" ")
show_args(*some_args)

# if we have a dictionary of arguments, we can use the ** syntax
# to unpack it into keyword arguments
print("Unpacking a dict:", end=" ")
show_args(**more_args)

Unpacking a sequence: 0 1 2
Unpacking a dict: ONE TWO THREE

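`*` and `**` unpacking can also be combined in a single call. A brief sketch, with a variant of `show_args` that returns its arguments instead of printing them (the names `positional` and `keywords` are made up):

```python
def show_args( arg1, arg2, arg3 = "THREE" ):
    return arg1, arg2, arg3

positional = [ "ONE" ]
keywords = { "arg3": "3", "arg2": "TWO" }

# positional items fill arg1 onward; dictionary keys are matched by name
result = show_args(*positional, **keywords)
print(result)
```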

Replacing a method on an object at runtime (monkey patching) can be confusing to maintain. It is mostly used in testing code, when we don't want the result to be actually sent to the client.

In [30]:
class A(object):

    def print(self):
        print("my class is A")

def fake_print():
    print("my class is not A")

a = A()
a.print()
a.print = fake_print
a.print()

my class is A
my class is not A

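In test code, the standard library's `unittest.mock.patch.object` performs a similar replacement but restores the original method when the `with` block ends. A minimal sketch (the class `Mailer` and its `send` method are hypothetical):

```python
from unittest import mock

class Mailer(object):

    def send(self, message):
        # imagine this actually contacts a client
        return "sent: " + message

mailer = Mailer()

# temporarily replace Mailer.send; the original is restored on exit
with mock.patch.object(Mailer, 'send', return_value = "fake send") as fake:
    patched_result = mailer.send("hello")
    fake.assert_called_once_with("hello")

restored_result = mailer.send("hello")
print(patched_result)
print(restored_result)
```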