# Miscellaneous

Note: some examples will use the following text file as data to process:

In [1]:
import os.path
text_path = os.path.join('examples', 'der_rote_komet.txt')

# Collections

## Index of iteration

In [2]:
line_number = 0  # Necessary for empty files where no line is read.
with open(text_path, 'r', encoding='utf-8') as text_file:
    for line_number, line in enumerate(text_file, 1):
        print(line_number, line.rstrip('\n\r'))
print('lines read: ', line_number)

1 "Siehst du die purpurne Röte, die in gerader Linie sich herab auf
2 die Erde senkt?" fragte Romulus Futurus in größter Aufregung seinen
3 Freund John Crofton, den berühmten Berichterstatter des "New York
4 Herald" in Berlin. "Bist du nun überzeugt, daß ich die Wahrheit
5 gesprochen habe? Noch kannst du den roten Kometen nicht erkennen,
6 und niemand wird imstande sein, ihn mit bloßem Auge zu sehen. Aber
7 jetzt gibst du zu, daß meine Diagnose richtig war?"
8                             (Robert Heymann, 1909, "Der rote Komet")
9 
lines read:  9
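
The initialization before the loop can be checked without a file. The following is a minimal sketch; `count_lines` is a hypothetical helper, and plain lists stand in for the opened text file:

```python
def count_lines(lines):
    """Count items the same way the loop above does (hypothetical helper)."""
    line_number = 0  # Stays 0 if `lines` yields nothing.
    for line_number, _ in enumerate(lines, 1):
        pass
    return line_number

empty_count = count_lines([])
three_count = count_lines(['a\n', 'b\n', 'c\n'])
print(empty_count, three_count)
```

Without the initial assignment, `line_number` would be undefined after looping over an empty input.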


## Find out if any item in a sequence matches a condition

In [3]:
with open(text_path, 'r', encoding='utf-8') as text_file:
    has_ue = any('ü' in line for line in text_file)
print(has_ue)

True


Note: `any()` stops processing lines as soon as the first line containing an `ü` is found.

## Find out if all items in a sequence match a condition

In [4]:
with open(text_path, 'r', encoding='utf-8') as text_file:
    has_e_in_every_line = all('e' in line for line in text_file)
print(has_e_in_every_line)

False


Note: `all()` stops processing lines as soon as the first line not containing an `e` is found.

## Initial items matching a condition
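
The early exit of `any()` and `all()` can be made visible with a small sketch. The generator `traced` and the word list are assumptions made for illustration; `traced` records every item that was actually consumed:

```python
def traced(lines, seen):
    """Yield each line and record it in `seen` (hypothetical helper)."""
    for line in lines:
        seen.append(line)
        yield line

lines = ['grün', 'rot', 'blau', 'über']

seen_by_any = []
has_ue = any('ü' in line for line in traced(lines, seen_by_any))

seen_by_all = []
all_have_r = all('r' in line for line in traced(lines, seen_by_all))

print(has_ue, seen_by_any)      # any() stopped after the first match
print(all_have_r, seen_by_all)  # all() stopped after the first mismatch
```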

In [5]:
import itertools

with open(text_path, 'r', encoding='utf-8') as text_file:
    for line in itertools.takewhile(lambda line: 'ö' in line, text_file):
        print(line.rstrip('\n\r'))

"Siehst du die purpurne Röte, die in gerader Linie sich herab auf
die Erde senkt?" fragte Romulus Futurus in größter Aufregung seinen


# Named tuples

* Named tuples are similar to tuples.
* Use names instead of indices to access items.
* Quick way to implement "read only classes" that only have attributes.

In [6]:
from collections import namedtuple
from datetime import date

Person = namedtuple('Person', ['name', 'size', 'date_of_birth'])

alice = Person('Alice', 172, date(1987, 3, 15))

alice

Person(name='Alice', size=172, date_of_birth=datetime.date(1987, 3, 15))

In [7]:
alice.date_of_birth

datetime.date(1987, 3, 15)
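
A short sketch of what "read only" means in practice: attributes cannot be reassigned, but `_replace()` builds a modified copy, and the usual tuple operations still work. The name `taller_alice` is only illustrative:

```python
from collections import namedtuple
from datetime import date

Person = namedtuple('Person', ['name', 'size', 'date_of_birth'])
alice = Person('Alice', 172, date(1987, 3, 15))

try:
    alice.size = 173  # Named tuples are immutable.
except AttributeError as error:
    print('cannot modify:', error)

taller_alice = alice._replace(size=173)  # New tuple, `alice` is unchanged.
name, size, date_of_birth = alice        # Unpacking works as with any tuple.
as_dict = alice._asdict()                # Mapping from field names to values.
```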

## Modules to skim

Recommendation: skim the documentation of the following modules. They contain various functions that are helpful for regular tasks when processing sequences:
* [collections](https://docs.python.org/3/library/collections.html)
* [itertools](https://docs.python.org/3/library/itertools.html)
* [functools](https://docs.python.org/3/library/functools.html)

# Decorators

* Decorators can enhance functions and classes.
* They start with `@`.
* The standard library already defines several decorators.
* You can implement your own.
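
As a minimal sketch of a self-made decorator, the following hypothetical `count_calls` wraps a function and counts how often it is called. The names `count_calls` and `greeting` are invented for this example:

```python
import functools

def count_calls(function):
    """Decorator that counts calls to `function` (hypothetical example)."""
    @functools.wraps(function)  # Preserve name and docstring of `function`.
    def wrapper(*args, **kwargs):
        wrapper.call_count += 1
        return function(*args, **kwargs)
    wrapper.call_count = 0
    return wrapper

@count_calls
def greeting(name):
    return 'Hello, ' + name

greeting('Alice')
greeting('Bob')
print(greeting.call_count)
```

Writing `@count_calls` above the definition is shorthand for `greeting = count_calls(greeting)`.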

## Caching

In [8]:
from functools import lru_cache
import requests

@lru_cache(maxsize=32)
def get_web_page(url):
    response = requests.get(url)
    response.raise_for_status()
    return response.text

get_web_page('https://www.python.org')
get_web_page('https://www.python.org')
get_web_page('https://www.python.org')

get_web_page.cache_info()

CacheInfo(hits=2, misses=1, maxsize=32, currsize=1)
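
The same effect can be observed without network access. In this sketch, the function `slow_square` and the list `computed_numbers` are assumptions for illustration; the list records which arguments actually reached the function body:

```python
from functools import lru_cache

computed_numbers = []

@lru_cache(maxsize=32)
def slow_square(number):
    computed_numbers.append(number)  # Only runs on a cache miss.
    return number * number

results = [slow_square(n) for n in (3, 3, 4, 3)]
info = slow_square.cache_info()

slow_square.cache_clear()  # Empties the cache and resets the statistics.
info_after_clear = slow_square.cache_info()
```

Arguments must be hashable, because `lru_cache` uses them as dictionary keys.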

## Static methods

* Classes usually contain methods that access the internal state of an instance (using `self`).
* Sometimes a method is only needed by a class but is independent of `self`.
* To avoid name clashes, it's better to make such a method part of the class instead of putting it somewhere in the global namespace.

## Example static method (1/2)

Recall how our `Person` class computes the age of a person:

In [9]:
from datetime import date

class Person(object):
    def __init__(self, name, size=None, date_of_birth=None):
        self.name = name
        self.size = size
        self.date_of_birth = date_of_birth

    def age(self):
        if self.date_of_birth is None:
            result = None
        else:
            today = date.today()
            # True if this year's birthday is still ahead of today.
            birthday_not_yet_reached = \
                (today.month, today.day) \
                < (self.date_of_birth.month, self.date_of_birth.day)
            result = today.year - self.date_of_birth.year - birthday_not_yet_reached
        return result

## Example static method (2/2)

In [10]:
class Person(object):
    def __init__(self, name, size=None, date_of_birth=None):
        self.name = name
        self.size = size
        self.date_of_birth = date_of_birth

    @staticmethod
    def _years_between(date1, date2):  # no 'self' here
        if (date1 is None) or (date2 is None):
            result = None
        else:
            is_date1_earlier_this_year = \
                (date1.month, date1.day) < (date2.month, date2.day)
            result = date1.year - date2.year - is_date1_earlier_this_year
        return result

    def age(self):
        return Person._years_between(date.today(), self.date_of_birth)
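
Because a static method does not need an instance, it can be called directly on the class, which also makes it easy to try with fixed dates. The following is a minimal sketch; the method `age_on` is a hypothetical addition so that the results do not depend on today's date:

```python
from datetime import date

class Person:
    def __init__(self, name, date_of_birth=None):
        self.name = name
        self.date_of_birth = date_of_birth

    @staticmethod
    def _years_between(date1, date2):  # no 'self' here
        if (date1 is None) or (date2 is None):
            return None
        # True if date1 lies before the anniversary of date2 in its year.
        is_before_anniversary = \
            (date1.month, date1.day) < (date2.month, date2.day)
        return date1.year - date2.year - is_before_anniversary

    def age_on(self, day):  # hypothetical helper for this example
        return Person._years_between(day, self.date_of_birth)

alice = Person('Alice', date(1987, 3, 15))
age_before_birthday = alice.age_on(date(2020, 3, 14))
age_on_birthday = alice.age_on(date(2020, 3, 15))
age_unknown = Person('Bob').age_on(date(2020, 1, 1))
years = Person._years_between(date(2000, 1, 1), date(1990, 6, 1))
```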

# Shell commands

* Python can call external shell commands.
* Advantages over regular shell scripts:
  * More deterministic error handling (an exit code not equal to 0 can raise an exception).
  * Output can be intercepted and postprocessed with Python.
  * All Python functions are available to handle strings, filter lines, convert dates etc., which can be painful in pure shell scripts.

## The `subprocess` module

The `subprocess` module provides means to call external programs:

In [11]:
import subprocess
subprocess.check_call(['ls', '-l', 'examples'])

0

## Fail on exit code != 0

In case the called program returns an exit code other than 0, `check_call` raises a `CalledProcessError`:

In [12]:
subprocess.check_call(['ls', '-l', 'no_such_folder'])

CalledProcessError: Command '['ls', '-l', 'no_such_folder']' returned non-zero exit status 2
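
The exception can be caught like any other, and its `returncode` attribute holds the exit code. To keep this sketch independent of `ls`, it assumes the current Python interpreter (`sys.executable`) as the external program and lets it exit with code 3:

```python
import subprocess
import sys

# A command that is guaranteed to fail with exit code 3.
failing_command = [sys.executable, '-c', 'import sys; sys.exit(3)']

try:
    subprocess.check_call(failing_command)
    exit_code = 0
except subprocess.CalledProcessError as error:
    exit_code = error.returncode
    print('command failed with exit code', exit_code)
```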

## Continue on exit code != 0

Using `call`, the exit code is simply returned and the caller has to handle errors itself:

In [13]:
subprocess.call(['ls', '-l', 'no_such_folder'])

2

## Using the output

The `check_output()` function returns the output of the console command as a byte string (`bytes`):

In [14]:
out = subprocess.check_output(['ls', '-l', 'examples'])
out

b'total 168\n-rw-rw-r-- 1 roskakori roskakori   1748 M\xc3\xa4r  7 01:05 copytool.py\n-rw-rw-r-- 1 roskakori roskakori    227 M\xc3\xa4r  9 18:37 csvdict.py\n-rw-rw-r-- 1 roskakori roskakori    221 M\xc3\xa4r  9 10:13 csvlist.py\n-rw-rw-r-- 1 roskakori roskakori    527 Feb 19 13:52 der_rote_komet.txt\n-rw-rw-r-- 1 roskakori roskakori    181 M\xc3\xa4r  7 10:10 logconsole.py\n-rw-rw-r-- 1 roskakori roskakori    674 M\xc3\xa4r  9 21:18 myapp.cfg\n-rw-rw-r-- 1 roskakori roskakori   1126 M\xc3\xa4r  9 21:11 myapp.py\n-rw-rw-r-- 1 roskakori roskakori     13 M\xc3\xa4r 10 00:21 numbers.txt\n-rw-rw-r-- 1 roskakori roskakori    480 M\xc3\xa4r  2 19:01 people.xml\n-rw-rw-r-- 1 roskakori roskakori     84 M\xc3\xa4r  9 10:20 persons.csv\ndrwxrwxr-x 2 roskakori roskakori   4096 M\xc3\xa4r  3 01:12 __pycache__\n-rw-rw-r-- 1 roskakori roskakori 114296 M\xc3\xa4r  1 05:09 pycharm_test_runner.png\n-rw-rw-r-- 1 roskakori roskakori     75 J\xc3\xa4n 20 07:56 some.txt\n-rw-rw-r-- 1 roskakori roskakori   

## Converting the output to strings

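The encoding of the output depends on the platform and its locale settings, so you have to decode the bytes accordingly to get a list of Python strings. On Ubuntu, where the output is typically UTF-8, this might work: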

In [15]:
out.decode('utf-8').split('\n')

['total 168',
 '-rw-rw-r-- 1 roskakori roskakori   1748 Mär  7 01:05 copytool.py',
 '-rw-rw-r-- 1 roskakori roskakori    227 Mär  9 18:37 csvdict.py',
 '-rw-rw-r-- 1 roskakori roskakori    221 Mär  9 10:13 csvlist.py',
 '-rw-rw-r-- 1 roskakori roskakori    527 Feb 19 13:52 der_rote_komet.txt',
 '-rw-rw-r-- 1 roskakori roskakori    181 Mär  7 10:10 logconsole.py',
 '-rw-rw-r-- 1 roskakori roskakori    674 Mär  9 21:18 myapp.cfg',
 '-rw-rw-r-- 1 roskakori roskakori   1126 Mär  9 21:11 myapp.py',
 '-rw-rw-r-- 1 roskakori roskakori     13 Mär 10 00:21 numbers.txt',
 '-rw-rw-r-- 1 roskakori roskakori    480 Mär  2 19:01 people.xml',
 '-rw-rw-r-- 1 roskakori roskakori     84 Mär  9 10:20 persons.csv',
 'drwxrwxr-x 2 roskakori roskakori   4096 Mär  3 01:12 __pycache__',
 '-rw-rw-r-- 1 roskakori roskakori 114296 Mär  1 05:09 pycharm_test_runner.png',
 '-rw-rw-r-- 1 roskakori roskakori     75 Jän 20 07:56 some.txt',
 '-rw-rw-r-- 1 roskakori roskakori    505 Mär  3 00:56 test_divided_using_pytes
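
As a sketch of two variations: `splitlines()` avoids the trailing empty string that `split('\n')` leaves after the final newline, and `text=True` (Python 3.7 or later) lets `check_output()` decode with the locale's default encoding. The command here is an assumption for illustration: it runs the current Python interpreter to print two ASCII lines, so the result does not depend on the encoding:

```python
import subprocess
import sys

command = [sys.executable, '-c', 'print("first"); print("second")']

# Decode manually and split without a trailing empty item.
raw = subprocess.check_output(command)
lines = raw.decode('utf-8').splitlines()

# Let subprocess decode using the locale's default encoding.
text_lines = subprocess.check_output(command, text=True).splitlines()

print(lines, text_lines)
```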

# Environment variables

Python maps environment variables to a dictionary:

In [16]:
import os
os.environ['USER']

'roskakori'

In [17]:
os.environ['DUMMY'] = 'whatever'

There are also more traditional functions:

In [18]:
os.getenv('USER')

'roskakori'

In [19]:
os.unsetenv('DUMMY')  # Caution: does not update os.environ; prefer: del os.environ['DUMMY']
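
Because `os.environ` behaves like a dictionary, the usual dictionary operations apply, and changes are inherited by child processes. A minimal sketch; the variable name `NO_SUCH_VARIABLE_FOR_DEMO` is an assumption chosen to be unset, and the child process is simply the current Python interpreter:

```python
import os
import subprocess
import sys

# get() with a default avoids a KeyError for unset variables.
fallback_value = os.environ.get('NO_SUCH_VARIABLE_FOR_DEMO', 'fallback')

# Variables set through os.environ are visible to child processes.
os.environ['DUMMY'] = 'whatever'
child_value = subprocess.check_output(
    [sys.executable, '-c', 'import os; print(os.environ["DUMMY"])'],
    universal_newlines=True,
).strip()

# Deleting the item also removes the variable from the process environment.
del os.environ['DUMMY']
still_set = 'DUMMY' in os.environ
```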

# Summary

* Python has utility modules that help with processing sequences.
* Decorators can enhance functions and methods.
* The `subprocess` module can call external commands, which allows mixing Python with shell commands.
* Environment variables can be accessed using the `os.environ` dictionary.