# Variables
## Basic Types

### Integer

In [1]:
i = 3
type(i), i

(int, 3)

In [2]:
i = int(3.8) # explicit casting from other type (here float)
type(i), i

(int, 3)

In [3]:
5/3, 6/3, 5//3

(1.6666666666666667, 2.0, 1)

Dividing two integers using / always gives a float, regardless of whether the result is a whole number.

The operator // performs floor division, i.e. it returns the quotient rounded down to the nearest integer.
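
As a small sketch of how // behaves beyond the positive case shown above: the result is rounded towards negative infinity, and the built-in divmod returns quotient and remainder together. The variable names are only illustrative.

```python
# Floor division rounds down (towards negative infinity), not towards zero
q_pos = 5 // 3     # 1
q_neg = -5 // 3    # -2, because -1.67 rounded down is -2

# The remainder operator % is consistent with //
r_neg = -5 % 3     # 1, since -5 == 3 * (-2) + 1

# divmod returns quotient and remainder in one call
q, r = divmod(17, 5)   # (3, 2)
print(q_pos, q_neg, r_neg, q, r)
```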

In [4]:
i = 10**100 + 1
i

10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001

Integers in Python have arbitrary precision (limited only by the available memory), i.e. there is no risk of overflow!

### Float

In [5]:
i = 4.0
type(i), i

(float, 4.0)

In [6]:
j = 3
z = i*j
type(z), z

(float, 12.0)

In [7]:
y = 2.2 * 3
type(y), y, y == 6.6

(float, 6.6000000000000005, False)

Multiplying a float by an int gives a float.

Note that float is binary-based and has limited precision (as in almost every other programming language). 
Therefore, be careful with equality comparisons: as the example above shows, 2.2 * 3 is not exactly equal to 6.6.
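
One common way to compare floats is to test for closeness instead of equality. The following is a minimal sketch using math.isclose from the standard library; the tolerance values are example choices, not recommendations.

```python
import math

y = 2.2 * 3
exact_equal = (y == 6.6)               # False because of binary rounding
close_enough = math.isclose(y, 6.6)    # True with the default relative tolerance of 1e-09

# For comparisons against zero, an absolute tolerance is needed
near_zero = math.isclose(1e-12, 0.0, abs_tol=1e-9)   # True
print(exact_equal, close_enough, near_zero)
```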

### Decimal

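The Decimal data type stores numbers in base 10 with a configurable precision (28 significant digits by default), so decimal fractions such as 2.2 are represented exactly. It can be used as an alternative to float if exact decimal arithmetic is relevant (e.g. for currency values). Note that Decimal is significantly slower than float.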
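
The sketch below illustrates three typical Decimal operations: exact decimal sums, rounding a currency amount, and temporarily changing the precision. The values are made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext

# Construct Decimals from strings; a float argument would carry its binary error along
from_str = Decimal('0.1') + Decimal('0.2')   # Decimal('0.3')
exact_sum = (from_str == Decimal('0.3'))      # True
float_sum = (0.1 + 0.2 == 0.3)                # False

# Round a currency amount to 2 decimal places
price = Decimal('2.675').quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)  # Decimal('2.68')

# The precision (number of significant digits) can be changed temporarily
with localcontext() as ctx:
    ctx.prec = 5
    third = Decimal(1) / Decimal(3)           # Decimal('0.33333')
print(from_str, price, third)
```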

In [8]:
from decimal import Decimal

In [9]:
y2 = Decimal('2.2') * Decimal(3)
type(y2), y2, y2 == Decimal('6.6')

(decimal.Decimal, Decimal('6.6'), True)

### Complex Numbers

In [10]:
c1 = complex(real=1, imag=-2)
type(c1), c1

(complex, (1-2j))

In [11]:
c1.real, c1.imag, c1.conjugate()

(1.0, -2.0, (1+2j))

In [12]:
abs(c1)

2.23606797749979

In [13]:
import cmath
cmath.exp(c1)

(-1.1312043837568135-2.4717266720048188j)

Python has a built-in data structure for complex numbers. 
Mathematical functions supporting complex numbers are in the *cmath* module of the standard library (instead of the *math* module for real numbers).
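
To illustrate the difference between the two modules, here is a short sketch: math rejects a negative argument for the square root, while cmath returns a complex result.

```python
import cmath
import math

# math works on real numbers only and rejects a negative argument
try:
    math.sqrt(-1)
    real_result = 'no error'
except ValueError:
    real_result = 'ValueError'

# cmath returns a complex result instead
complex_root = cmath.sqrt(-1)          # 1j

# Polar form: modulus and phase (in radians) of 1-2j
modulus, phase = cmath.polar(complex(1, -2))
print(real_result, complex_root, modulus, phase)
```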

### String

In [14]:
s1 = 'Hello World!'
type(s1), s1

(str, 'Hello World!')

In [15]:
j = 3.6
s2 = str(j) # explicit casting
type(s2), s2

(str, '3.6')

In [16]:
s1[3], s1[3:-2]

('l', 'lo Worl')

Access to single characters and slicing: indices count from 0 to n-1, negative indices count from the end. The end index of a slice is exclusive.

In [17]:
sql = """
select * 
from {table}
where col1 = 'hallo' and col2 = 'welt' 
""".format(table='myTable')
print(sql)


select * 
from myTable
where col1 = 'hallo' and col2 = 'welt' 



Strings spanning multiple lines are defined using """str""" or '''str'''.

If the string text contains ', use " as the string delimiter and vice versa; this removes the need to escape these characters.

In [18]:
s5 = 'hello\nWorld!' # \ is the escape character, \n is line break
print(s5)

hello
World!


In [19]:
s6 = r'hello \n World!'
print(s6)

hello \n World!


A "raw" string (prefix r) does not interpret \ as an escape character. This is especially useful for regular expressions and Windows-format paths.
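
A brief sketch of both use cases follows. The path and the sample text are hypothetical and serve only as illustration.

```python
import re

# Without the r prefix, \n and \t inside the path would be interpreted as escapes
win_path = r'C:\new_folder\table.csv'        # hypothetical path
escaped_path = 'C:\\new_folder\\table.csv'   # same text with doubled backslashes
same = (win_path == escaped_path)            # True

# In regular expressions, raw strings keep backslash sequences such as \d readable
numbers = re.findall(r'\d+', 'order 66, aisle 7')   # ['66', '7']
print(win_path, same, numbers)
```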

#### String generation

In [20]:
s1 + ' ' + s2 + ', ' + str(42)

'Hello World! 3.6, 42'

In [21]:
s1*3

'Hello World!Hello World!Hello World!'

Simple concatenation with + and repetition with *

In [22]:
'; '.join([s1, s2])

'Hello World!; 3.6'

In [23]:
' is followed by '.join(str(x) for x in range(10))

'0 is followed by 1 is followed by 2 is followed by 3 is followed by 4 is followed by 5 is followed by 6 is followed by 7 is followed by 8 is followed by 9'

Join a list or any other iterable of strings using the given separator string

In [24]:
'Hi {}, how are you?'.format('John') 
# string formatting is a nicer way to construct strings

'Hi John, how are you?'

In [25]:
'Hi {name}, your waiting number is {number}'.format(number=42, name='Bob') 
# no need to cast data types to string here

'Hi Bob, your waiting number is 42'

In [26]:
'Temperature: {temp:4.1f} degrees, humidity: {:4.2f} %'.format(
    24.54821, temp=23.1234567) 

'Temperature: 23.1 degrees, humidity: 24.55 %'

String generation using format (recommended method)
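
Since Python 3.6, formatted string literals (f-strings) offer a more compact alternative that accepts the same format specifications as format. The following sketch reuses the values from the examples above; the variable names are illustrative.

```python
name = 'Bob'
number = 42
temp = 23.1234567

# Expressions in braces are evaluated and inserted in place
greeting = f'Hi {name}, your waiting number is {number}'
# The format specification after the colon works as in str.format
reading = f'Temperature: {temp:4.1f} degrees'
print(greeting)
print(reading)
```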

In [27]:
'Temperature: %4.1f degrees' % 23.235

'Temperature: 23.2 degrees'

Old %-style formatting (it predates format, which was introduced in Python 2.6), not recommended anymore

#### String manipulation

In [28]:
s1.upper()

'HELLO WORLD!'

In [29]:
s1.lower()

'hello world!'

In [30]:
s3 = " hello world!   "
s3.strip() # removes leading and trailing whitespace

'hello world!'

### Boolean

In [31]:
b1 = True
type(b1), b1

(bool, True)

In [32]:
b2 = 5 == 3 + 4
type(b2), b2

(bool, False)

In [33]:
b3 = 5 != 3 + 4
type(b3), b3

(bool, True)

### None Type

In [34]:
no = None
type(no), no

(NoneType, None)

In [35]:
no is None, 'hello' is None # recommended way to check for None values

(True, False)

In [36]:
no == None, 'hello' == None # not recommended

(True, False)

In [37]:
x = 'standard' if no is None else no # assign standard value to variable
x

'standard'

Used as a sentinel value for missing entries, omitted input values, etc.
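
A typical case of an "omitted input value" is an optional function argument. The sketch below uses a hypothetical function add_entry in which None signals that no list was passed, so that a fresh list is created on each call.

```python
def add_entry(entry, target=None):
    """Append entry to target; create a new list if none was passed."""
    if target is None:      # None marks "argument omitted"
        target = []
    target.append(entry)
    return target

first = add_entry('a')            # new list containing 'a'
second = add_entry('b')           # another new list, independent of first
shared = add_entry('c', first)    # the passed list is extended in place
print(second, shared, shared is first)
```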

# Collections
## List

In [38]:
l1 = [1, 5.0, 'Hallo', 'Welt']
type(l1), l1

(list, [1, 5.0, 'Hallo', 'Welt'])

In [39]:
l1[0], l1[2], l1[-1]

(1, 'Hallo', 'Welt')

Indices count from 0 to n-1; negative indices count from the end.

In [40]:
l1[1:3], l1[2:], l1[:-2]

([5.0, 'Hallo'], ['Hallo', 'Welt'], [1, 5.0])

Slicing: the end index is exclusive; omitting the start or end index means "from the beginning" or "up to the end".

In [41]:
l1.append(42)
l1

[1, 5.0, 'Hallo', 'Welt', 42]

Adds a single element to the list

In [42]:
lx = ['x', 'y']
l1.extend(lx)
l1

[1, 5.0, 'Hallo', 'Welt', 42, 'x', 'y']

Adds multiple elements (e.g. another list) to the list

In [43]:
l1 + lx

[1, 5.0, 'Hallo', 'Welt', 42, 'x', 'y', 'x', 'y']

Concatenates lists, but does not change l1 or lx

In [44]:
l2 = list(range(10000))
l2[0:10]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Construct list from range iterator (i.e. all numbers from 0 to n-1)

#### Construction using list comprehension

In [45]:
[i**2 + 1 for i in range(20) if i % 2 != 0]

[2, 10, 26, 50, 82, 122, 170, 226, 290, 362]

In [46]:
l3 = []
for i in range(20):
    if i % 2 != 0:
        l3.append(i**2 + 1)
l3

[2, 10, 26, 50, 82, 122, 170, 226, 290, 362]

The list comprehension is much shorter and more understandable than the corresponding "classical" implementation.

#### Check membership in list

In [47]:
10 in l2

True

In [48]:
-1 in l2

False

In [49]:
%timeit (-1 in l2)

225 µs ± 144 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)


#### Loop over list

In [50]:
for item in l1:
    print(item)

1
5.0
Hallo
Welt
42
x
y


In [51]:
for nr in range(len(l1)): # bad way of getting item number and content
    print('number:', nr, 'content:', l1[nr])

number: 0 content: 1
number: 1 content: 5.0
number: 2 content: Hallo
number: 3 content: Welt
number: 4 content: 42
number: 5 content: x
number: 6 content: y


In [52]:
for nr, item in enumerate(l1): # recommended way
    print('number:', nr, 'content:', item)

number: 0 content: 1
number: 1 content: 5.0
number: 2 content: Hallo
number: 3 content: Welt
number: 4 content: 42
number: 5 content: x
number: 6 content: y


Use the built-in enumerate function if you need both the number of the item and the item itself. 
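
Two related variations, sketched with made-up example lists: enumerate accepts a start value, and the built-in zip pairs up the items of several iterables without any index arithmetic.

```python
labels = ['x', 'y', 'z']
values = [10, -7, 0.96]

# enumerate accepts a start value, e.g. for 1-based numbering
numbered = list(enumerate(labels, start=1))   # [(1, 'x'), (2, 'y'), (3, 'z')]

# zip yields one tuple per position of the given iterables
pairs = list(zip(labels, values))             # [('x', 10), ('y', -7), ('z', 0.96)]
print(numbered, pairs)
```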

### Tuple

An ordered collection analogous to a list, but immutable, i.e. it cannot be changed after creation.

In [53]:
t1 = (1, 2, 3, 4) # parentheses not required for definition here
type(t1), t1 # note that this is itself a tuple, too

(tuple, (1, 2, 3, 4))

In [54]:
t1[1]

2

In [55]:
try:
    t1[1] = 5 # this raises a TypeError
except TypeError as e:
    print(e)

'tuple' object does not support item assignment


Typical use cases for tuples:
* If a function returns multiple values
* Default values for function parameters
* Anywhere a list would be used that does not need to be changed after creation (tuples are more memory-efficient)
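
The first use case can be sketched as follows. The function min_max is a hypothetical example; it returns two values as a tuple, which the caller unpacks into separate names.

```python
def min_max(values):
    """Return the smallest and the largest value as a tuple."""
    return min(values), max(values)   # the comma creates the tuple

result = min_max([3, 1, 4, 1, 5])     # (1, 5)
lowest, highest = result              # tuple unpacking into two names
print(type(result), lowest, highest)
```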

### Named Tuple

In [56]:
from collections import namedtuple

In [57]:
CoordTuple = namedtuple('CoordTuple', ['x', 'y', 'z']) # note that CoordTuple is a class
coord1 = CoordTuple(x=10, y=-7, z=0.96)
type(coord1), coord1

(__main__.CoordTuple, CoordTuple(x=10, y=-7, z=0.96))

In [58]:
coord1.x, coord1[1] # access via variable name or position

(10, -7)

In [59]:
try:
    coord2 = CoordTuple(x=-10, y=42) # raises TypeError: all variables must be defined
except TypeError as e:
    print(e)

__new__() missing 1 required positional argument: 'z'


In [60]:
coord2 = CoordTuple(x=-10, y=42, z=None)
coord2

CoordTuple(x=-10, y=42, z=None)

Named Tuple is a class factory for data containers with named variables.
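
A few further features of namedtuple, shown as a sketch: default values (the defaults argument requires Python 3.7 or later), creating a modified copy with _replace, and conversion to a dictionary with _asdict.

```python
from collections import namedtuple

# defaults apply to the rightmost fields, here only z
CoordTuple = namedtuple('CoordTuple', ['x', 'y', 'z'], defaults=[None])

coord = CoordTuple(x=-10, y=42)       # z falls back to None
moved = coord._replace(z=0.5)         # returns a new tuple, coord is unchanged
as_dict = moved._asdict()             # maps field names to values
print(coord, moved, as_dict)
```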

## Set

In [61]:
set1 = {3, 5, 4, 8, 9, 9}
type(set1), set1

(set, {3, 4, 5, 8, 9})

In [62]:
set2 = set(range(10000))

In [63]:
set1.add(10)
set1

{3, 4, 5, 8, 9, 10}

In [64]:
set0 = set() # empty set needs to be generated with set(), {} creates empty dictionary

A set contains only unique members. It is a very efficient data structure due to hashing.

#### Construction using set comprehension

In [65]:
{i**2 + 1 for i in range(20) if i % 2 != 0}

{2, 10, 26, 50, 82, 122, 170, 226, 290, 362}

#### Comparing sets

In [66]:
set0 = {-2, 3, 4, 8, 15}
set1 - set0, set0 - set1 # checking which value is in one set but not the other

({5, 9, 10}, {-2, 15})

In [67]:
set0.union(set1)

{-2, 3, 4, 5, 8, 9, 10, 15}

In [68]:
set0.intersection(set1)

{3, 4, 8}

#### Check membership in set

In [69]:
100 in set2

True

In [70]:
-1 in set2

False

In [71]:
%timeit (-1 in set2)

73 ns ± 0.0389 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [72]:
set3 = set(range(1000000))

In [73]:
%timeit (-1 in set3)

78.2 ns ± 0.0498 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


Checking membership in large sets is much faster than in lists.

Lists are O(n) whereas sets are O(1) - in the example, a set 100 times larger still gives the same time for the check.
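
Outside of a notebook, where the %timeit magic is not available, a similar comparison can be made with the timeit module of the standard library. This is a minimal sketch; the collection size and number of repetitions are arbitrary choices, and the measured times depend on the machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)

# -1 is not contained, so the list has to be scanned completely (worst case)
t_list = timeit.timeit(lambda: -1 in as_list, number=100)
t_set = timeit.timeit(lambda: -1 in as_set, number=100)
print(f'list: {t_list:.6f} s, set: {t_set:.6f} s for 100 lookups each')
```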

### Frozen Set

In [74]:
fs = frozenset(set3)

In [75]:
%timeit (-1 in fs) # membership check is also O(1)

73 ns ± 0.043 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


This is an immutable set, i.e. it relates to a set in the same way a tuple relates to a list.
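
One practical consequence of immutability, sketched below with made-up data: a frozenset is hashable and can therefore be used as a dictionary key or as a member of another set, which is not possible with a regular set.

```python
fs_a = frozenset({1, 2, 3})

# A frozenset is hashable, so it can serve as a dictionary key
labels = {fs_a: 'small numbers'}
lookup = labels[frozenset({3, 2, 1})]     # element order does not matter

# A mutable set is not hashable
try:
    {{1, 2, 3}: 'fails'}
    set_as_key = 'allowed'
except TypeError:
    set_as_key = 'TypeError'

# Modifying methods such as add do not exist on a frozenset
has_add = hasattr(fs_a, 'add')            # False
print(lookup, set_as_key, has_add)
```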

### Dictionaries

In [76]:
d1 = {'a': 1, 'b': 2, 'c': 3}
type(d1), d1

(dict, {'a': 1, 'b': 2, 'c': 3})

In [77]:
d1['b']

2

In [78]:
try:
    d1['x'] # raises KeyError because key does not exist
except KeyError as e:
    print(e)

'x'


In [79]:
d1.get('c', 'nothing found'), d1.get('x', 'nothing found') # definition of default value

(3, 'nothing found')

In [80]:
d1['d'] = 4 # add new key/ value pair
d1

{'a': 1, 'b': 2, 'c': 3, 'd': 4}

In [81]:
d1['a'] = 10 # update existing value
d1

{'a': 10, 'b': 2, 'c': 3, 'd': 4}

In [82]:
d1.keys(), d1.values(), d1.items()

(dict_keys(['a', 'b', 'c', 'd']),
 dict_values([10, 2, 3, 4]),
 dict_items([('a', 10), ('b', 2), ('c', 3), ('d', 4)]))

In [83]:
d2 = dict(aa=31, bb=32, cc=33, b=42) # alternative constructor
d2

{'aa': 31, 'bb': 32, 'cc': 33, 'b': 42}

In [84]:
dall = {**d1, **d2} # join 2 dictionaries; for duplicate keys the value from the later dictionary (d2) wins
dall

{'a': 10, 'b': 42, 'c': 3, 'd': 4, 'aa': 31, 'bb': 32, 'cc': 33}

Dictionaries contain key-value pairs. Keys must be unique and hashable (e.g. int, str). Values can be of any type (including lists and other dictionaries).

Access to dictionaries is very efficient due to hashing of keys. They correspond to HashMaps in Java.

Note that since Python 3.7 dictionaries preserve the insertion order of their key-value pairs; in older versions the order was arbitrary. If the code has to run on older versions or needs position-based access, use a list instead.
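
To illustrate the requirement that keys must be hashable, here is a short sketch with hypothetical example data: a tuple works as a key, a list does not.

```python
# Immutable built-ins such as str, int and tuple are hashable and work as keys
grid = {(0, 0): 'origin', (1, 2): 'point A'}
origin = grid[(0, 0)]

# A list is mutable and therefore not hashable
try:
    bad = {[0, 0]: 'origin'}
    list_as_key = 'allowed'
except TypeError:
    list_as_key = 'TypeError'
print(origin, list_as_key)
```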

#### Construction using dict comprehension

In [85]:
{i: i**2 + 1 for i in range(20) if i % 2 != 0}

{1: 2, 3: 10, 5: 26, 7: 50, 9: 82, 11: 122, 13: 170, 15: 226, 17: 290, 19: 362}

#### Using dictionaries and lists, JSON-like data structures can be created:

In [86]:
data = {'label': 'example_data',
        'description': """example of json like data structure in Python""",
        'entries': [
            {'label': 'point1', 'value': 12},
            {'label': 'point2', 'value': 19},
            ],
        'meta': {
            'creation_date': '2019-02-31', # use a real date type here to avoid mistakes like this (February 31 does not exist)
            'user': 'MrSmith',
            }
}
data

{'label': 'example_data',
 'description': 'example of json like data structure in Python',
 'entries': [{'label': 'point1', 'value': 12},
  {'label': 'point2', 'value': 19}],
 'meta': {'creation_date': '2019-02-31', 'user': 'MrSmith'}}
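
Such a structure can be converted to and from an actual JSON string with the json module of the standard library. The sketch below uses a shortened version of the example data.

```python
import json

data = {
    'label': 'example_data',
    'entries': [
        {'label': 'point1', 'value': 12},
        {'label': 'point2', 'value': 19},
    ],
}

text = json.dumps(data, indent=2)   # dict/list structure -> JSON string
restored = json.loads(text)         # JSON string -> dict/list structure
print(text)
print(restored == data)
```

The round trip is lossless here because the data contains only strings, integers, lists and dictionaries with string keys. Tuples, for example, come back as lists, and non-string keys come back as strings.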

#### Looping over dictionaries

In [87]:
for key in d1.keys(): # bad way
    print(key, d1[key])

a 10
b 2
c 3
d 4


In [88]:
for key, value in d1.items(): # recommended way
    print(key, value)

a 10
b 2
c 3
d 4


#### DefaultDict

In [89]:
from collections import defaultdict

In [90]:
sample_string = 'mississippi'

In [91]:
ddict1 = defaultdict(int)
for char in sample_string:
    ddict1[char] += 1
ddict1

defaultdict(int, {'m': 1, 'i': 4, 's': 4, 'p': 2})

In [92]:
ddict2 = defaultdict(list)
for pos, char in enumerate(sample_string):
    ddict2[char].append(pos)
ddict2

defaultdict(list,
            {'m': [0], 'i': [1, 4, 7, 10], 's': [2, 3, 5, 6], 'p': [8, 9]})

A defaultdict is a dictionary that creates a default value for every missing key by calling the given factory (here int, which returns 0, or list, which returns an empty list).
This makes it possible to apply type-specific operations (such as += or append) even to keys that did not exist before.
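
For comparison, the following sketch counts the characters once with a plain dictionary and once with a defaultdict, and shows that any callable without arguments can serve as the factory. The name with_unknown is only illustrative.

```python
from collections import defaultdict

sample_string = 'mississippi'

# Plain dict: the missing-key case has to be handled explicitly
counts_plain = {}
for char in sample_string:
    counts_plain[char] = counts_plain.get(char, 0) + 1

# defaultdict: int() is called for a missing key and returns 0
counts_default = defaultdict(int)
for char in sample_string:
    counts_default[char] += 1

# Any callable without arguments works as factory, e.g. a lambda
with_unknown = defaultdict(lambda: 'unknown')
missing = with_unknown['x']       # 'unknown'; the key 'x' is stored as a side effect
print(counts_plain == dict(counts_default), missing, 'x' in with_unknown)
```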

# Typing

## Duck Typing

In [96]:
def double_me(x):
    return x*2

In [97]:
y = double_me(3)
type(y), y

(int, 6)

In [98]:
y = double_me(3.7)
type(y), y

(float, 7.4)

In [99]:
from decimal import Decimal
y = double_me(Decimal('3.34'))
type(y), y

(decimal.Decimal, Decimal('6.68'))

In [100]:
import numpy as np
a1 = np.random.rand(10,1)
y = double_me(a1)
type(y), y

(numpy.ndarray, array([[1.29271574],
        [1.6550845 ],
        [1.50829656],
        [0.80914176],
        [0.73888834],
        [1.373431  ],
        [0.66972989],
        [0.32720025],
        [1.37050522],
        [1.84639103]]))

In [101]:
y = double_me('hello world!')
type(y), y

(str, 'hello world!hello world!')

In [102]:
y = double_me([1, 3, 6])
type(y), y

(list, [1, 3, 6, 1, 3, 6])

In [103]:
try:
    y = double_me({1: 2, 2: 4}) # multiplication is not defined for a dictionary
except TypeError as e:
    print(e)

unsupported operand type(s) for *: 'dict' and 'int'


*"When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck."* - James Whitcomb Riley

The function defined above works on any numerical data type and also on some non-numerical data types (like strings and lists).

This is a fundamental design principle of Python: inputs and outputs of methods do not have pre-defined data types. The input variables could have any data type, as long as the operations in the method work for this type. The output data type(s) of a method is also dynamic, it may depend on the input variable data types or other factors.

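There is no need for method overloading (i.e. defining several methods with the same name but a different number and/or different types of input variables) in Python, and the language does not support it directly: a second definition with the same name simply replaces the first.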
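
Where other languages would use overloads for a varying number of arguments, Python code typically uses optional parameters with default values. The function box_volume below is a hypothetical example of this pattern.

```python
def box_volume(length, width=None, height=None):
    """Volume of a box; a cube if only the edge length is given."""
    if width is None:
        width = length
    if height is None:
        height = length
    return length * width * height

cube = box_volume(2)              # 8
box = box_volume(2, 3, 4)         # 24
flat = box_volume(2, height=0.5)  # 2.0, width defaults to length
print(cube, box, flat)
```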

## Type Hints

In [104]:
def double_int(i: int) -> int:
    i_doubled: int = i*2
    return i_doubled

In [105]:
y = double_int(3)
type(y), y

(int, 6)

In [106]:
y = double_int('hello world') # no error raised even though input type is not int
type(y), y

(str, 'hello worldhello world')

While duck typing is very powerful, it can also be dangerous if a value is passed to a function for which it does not produce the expected result (e.g. a string that is repeated instead of a number that is doubled).

Furthermore, in large programs it is often difficult to infer which types input and output variables have.

Type hints can be used to indicate the expected types of variables and of function inputs and outputs. Note that type hints are little more than comments: the Python interpreter does not raise an error if the actual type differs from the one specified in the hint.
However, a linter or static type checker (integrated, for example, in IDEs) can use the type hints to check code correctness and to improve auto-completion.
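
That the hints are only metadata can be seen by inspecting the function object. In this sketch, the hints are stored in the __annotations__ attribute, and a call with a float is executed without complaint.

```python
def double_int(i: int) -> int:
    return i * 2

# The hints are stored as plain metadata on the function object
hints = double_int.__annotations__   # maps 'i' and 'return' to int

# They are not enforced at run time: a float is accepted and doubled
result = double_int(1.5)             # 3.0
print(hints, result)
```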


In [107]:
from typing import Union
def double_number(x: Union[int, float]) -> Union[int, float]:
    return x*2

In [108]:
y = double_number(3.9)
type(y), y

(float, 7.8)

The typing module provides several useful constructs for type hints, such as Union for cases in which multiple types are supported.

It is encouraged to use type hints anywhere the variable type is fixed. Do not use type hints for functions or variables that benefit from duck typing.

In [109]:
def double_or_square(x, method: str):
    if method == 'double':
        return x*2
    elif method == 'square':
        return x**2
    else:
        raise ValueError('unknown method given')

Here, a type hint is only given for *method*, which must be a string. *x* and the return value may be of many types.

## Explicit Type Checks

In [110]:
def double_int2(i: int) -> int:
    assert isinstance(i, int), "i must be integer"
    return i*2

In [111]:
y = double_int2(3)
type(y), y

(int, 6)

In [112]:
try:
    y = double_int2(3.2) # raises AssertionError
except AssertionError as e:
    print(e)

i must be integer


Use *assert isinstance(variable, type), 'message'* for explicit type checking. The interpreter raises an error if the type check fails.
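
Note that assert statements are skipped when Python runs with the -O option. If the check must always be active, an explicit exception can be raised instead. The function double_int3 below is a hypothetical variant of double_int2 sketching this approach.

```python
def double_int3(i: int) -> int:
    # Explicit check that stays active even with python -O
    if not isinstance(i, int):
        raise TypeError(f'i must be an integer, got {type(i).__name__}')
    return i * 2

try:
    double_int3(3.2)
    message = 'no error'
except TypeError as e:
    message = str(e)
print(double_int3(3), message)
```

Be aware that bool is a subclass of int, so isinstance(True, int) is True and double_int3(True) returns 2.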

Author: Benjamin Lungwitz