
## Introduction

Goals:

- solid understanding of Python
- introduction to tools used throughout the courses

Me:

- Using Python since 2010
- Presented about Python internals at EuroPython 2016

# Basic Building Blocks

- Lay firm foundation for developing scripting skills
- Blocks: datatypes, conditionals, looping
- **Python 3**

## Numbers and Constants

In [2]:
# Integers
3 + 4

7

In [3]:
# floats
3.0 - 4

-1.0

In [4]:
# Negative numbers are expressed exactly as you would expect:
- 2.9

-2.9

In [5]:
True, False

(True, False)

In [6]:
None

## Functions

Fun fact: in Python, functions are *objects*

It's a little-known (or little-acknowledged) fact about Python, but functions are objects, first-class citizens of the type hierarchy.

For now we will just look quickly at how to define them, but we'll come back to some other things we can do with them later!

In [139]:
# Here's how you define one
def dummy():
    pass

In [140]:
# And here's how you call it
dummy()

Notice nothing happened?

Let's define a function that actually does something, albeit not very interesting.

In [9]:
def add(x, y):
    return x + y

In [10]:
add(3, 4)

7

**Exercise:** Implement a `subtract` function in terms of `add`!

### Useful Built-in Functions

In a Python prompt, use `help`.

In [11]:
help(add)

Help on function add in module __main__:

add(x, y)



In Jupyter notebooks you can use this shortcut: a question mark after the object you want help for.

In [12]:
add?

In [13]:
help?

Another function we'll be using a lot is `print`.
Compare the outputs of the following cells.

In [14]:
print(3)
print(4)

3
4


In [15]:
3
4

4

Use `type` to find out an object's "type".
Remember our silly `add` function?

In [16]:
type(add)

function
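
Because functions are objects, a function can be bound to a second name or handed to another function like any other value. The following is a minimal sketch of that idea; `plus` and `apply` are hypothetical names introduced only for this illustration, and `add` is redefined so the snippet runs on its own.

```python
def add(x, y):
    return x + y

# A function can be bound to another name, just like a number or a string.
plus = add
print(plus(3, 4))         # 7
print(plus is add)        # True: both names refer to the same function object

# A function can also be passed as an argument to another function.
def apply(func, x, y):
    return func(x, y)

print(apply(add, 10, 5))  # 15
```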

### Methods

Functions attached to objects.

In [17]:
4.3.as_integer_ratio()

(4841369599423283, 1125899906842624)

In [18]:
(3).bit_length()

2

In [19]:
dir(3)

['__abs__',
 '__add__',
 '__and__',
 '__bool__',
 '__ceil__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__divmod__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floor__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__index__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__le__',
 '__lshift__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__or__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rlshift__',
 '__rmod__',
 '__rmul__',
 '__ror__',
 '__round__',
 '__rpow__',
 '__rrshift__',
 '__rshift__',
 '__rsub__',
 '__rtruediv__',
 '__rxor__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__trunc__',
 '__xor__',
 'bit_length',
 'conjugate',
 'denominator',
 'from_bytes',
 'imag',
 'numerator',
 'real',
 'to_bytes']
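
Most of the names in that listing start and end with double underscores. As a rough illustration (not an exhaustive account of how Python dispatches operators), these methods are what operators and some built-in functions rely on behind the scenes; `public_names` is a hypothetical variable used only here.

```python
# The + operator on integers is backed by the __add__ method.
print((3).__add__(4))     # 7
print(3 + 4)              # 7

# The built-in abs() is backed by __abs__.
print((-3).__abs__())     # 3
print(abs(-3))            # 3

# Names without a leading underscore are the "ordinary" methods and attributes.
public_names = [name for name in dir(3) if not name.startswith("_")]
print("bit_length" in public_names)   # True
```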

## Collections

- encode more complex data
- leverage true power of computers

Sequences | Hashtables
:---: | :---:
Ordered | Unordered
item access by position | item access by name
can contain duplicates | all items are unique
**O(n)** lookup (slow) | **O(1)** lookup (fast)

### Strings

No "character" type in Python, only strings.

Strings are something between collections and singletons:

- not truly compound data
- behave like sequences

In [20]:
"string"

'string'

In [21]:
'string'

'string'

In [22]:
# mixing quotes
"Don't stop me now, I'm having such a good time!"
'Java programmers often write "pythonic" code.'

'Java programmers often write "pythonic" code.'

In [23]:
# Strings, escaping quotes
'Don\'t stop me now!'

"Don't stop me now!"

Since strings are sequences, we can access individual items using their position index. In Python these position indices start at `0`.

In [24]:
"string"[0]

's'

In [25]:
"string"[3]

'i'

In [26]:
"string"[4]

'n'

This is true for all sequences. 

Using an index equal to or greater than the length of the sequence results in an error.

In [27]:
"string"[10]

IndexError: string index out of range

Python sequences also support negative indices, which simply count from the end. Note that they start at `-1`.

In [28]:
"string"[-1]

'g'

In [29]:
"string"[-3]

'i'

In [30]:
"string"[-2]

'n'

Lastly, we can grab substrings from a string by using slices instead of indices.

In [31]:
"string"[:3]

'str'

In [32]:
"string"[3:]

'ing'

In [33]:
"string"[2:4]

'ri'

Negative indices can also be used in slices.

In [34]:
"string"[2:-2]

'ri'

In a slice we can also specify something called a `step`. Here's an example of how it works.

In [35]:
"very long string"[0:-1:2]

'vr ogsrn'

In this case we can omit the start and end indices, **but we must** keep the colons.

In [36]:
"very long string"[::2]

'vr ogsrn'
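
A few more properties of indexing and slicing are easy to check directly. This is a small sketch using the same `"string"` example; the variable name `s` is introduced only for brevity.

```python
s = "string"

# A negative index -k refers to the same item as len(s) - k.
print(s[-2] == s[len(s) - 2])   # True

# Unlike plain indexing, slices do not raise IndexError;
# out-of-range bounds are clipped to the sequence.
print(s[2:100])                 # ring

# A negative step walks backwards, so [::-1] reverses the sequence.
print(s[::-1])                  # gnirts

# Start, stop and step can be combined: every second item of the first five.
print(s[0:5:2])                 # srn
```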

We can combine strings together using `+`.

In [37]:
"string1" + ' ' + "string2"

'string1 string2'

Note, however, that most other arithmetic operators are not defined for strings (`*` with an integer, which repeats the string, is an exception).

In [38]:
"string" - "abc"

TypeError: unsupported operand type(s) for -: 'str' and 'str'

Strings have a lot of useful text manipulation methods built in.

In [39]:
print("hello world".upper())
print("hello world".title())
print("YYEEAAAHH!!".lower())

HELLO WORLD
Hello World
yyeeaaahh!!


In [40]:
print("hello world".isupper())
print("hello world".istitle())
print("YYEEAAAHH!!".islower())

False
False
False


Strings are unique among sequences in that it's possible to convert almost any type to them.

In [41]:
# We use a built-in function `repr` to highlight that the items were converted to strings
print(repr(str(4)))
print(repr(str(89.4134)))
print(repr(str(False)))

'4'
'89.4134'
'False'


One very common operation with any collection is checking whether it contains something.

In [42]:
"h" in "hello"

True

Strings are once again unique among collections in this regard because they allow checking subsequence membership.

In [43]:
"hell" in "hello"

True

### Tuples

Our first true compound datatype.

In [44]:
# Defining tuples explicitly...
(3, "second")

(3, 'second')

In [45]:
# ... and implicitly
3, 6

(3, 6)

Just like all sequences, tuples support both indexing and slicing.

In [46]:
(1, 2, 3)[:2]

(1, 2)

In [47]:
(1, 2, 3)[-1]

3

Tuples can be combined just like strings, with the `+` operator.

In [48]:
(1, 2) + (3, 4)

(1, 2, 3, 4)

Membership checking is also supported.

In [49]:
3 in (1, 2, 3)

True

Unlike strings, however, we cannot check "subtuple" membership.

In [50]:
(2, 3) in (1, 2, 3)

False

You can nest tuples inside tuples.
This also holds for the collections we look at later.

In [51]:
((1, 2), 3)[0][0]

1

It is also possible to convert other datatypes to tuples as long as they are sequences.

In [52]:
tuple("123")

('1', '2', '3')

Notice how in the process the string is split into "characters". To wrap the whole string in a one-element tuple instead, add a trailing comma:

In [53]:
# Explicit
('123',)

('123',)

In [54]:
# Implicit
'123',

('123',)
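
It is the comma, not the parentheses, that makes the tuple. A short check of that claim; the variable names are hypothetical and used only for this illustration.

```python
# Parentheses alone only group an expression; this is still a string.
not_a_tuple = ("123")
print(type(not_a_tuple))   # <class 'str'>

# The trailing comma is what creates the one-element tuple.
one_tuple = ("123",)
print(type(one_tuple))     # <class 'tuple'>
print(len(one_tuple))      # 1

# Compare with tuple(), which splits the string into its characters.
print(len(tuple("123")))   # 3
```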

### Lists

Lists are just like tuples except that you can modify them. More about that later.

In [55]:
[1, "3", (2, 4)]

[1, '3', (2, 4)]

Indexing works exactly like with the other sequences.

In [56]:
[1, 2, 3][0]

1

In [57]:
[1, 2, 3][2]

3

In [58]:
[1, 2, 3][4]

IndexError: list index out of range

In [59]:
[1, 2, 3, 4][::2]

[1, 3]

Combining lists also works.

In [60]:
[1, 2] + [1]

[1, 2, 1]

So does membership testing!

In [1]:
1 in [1, 2, 3]

True

Converting to a list works exactly like with tuples:
- other collections are converted using `list`
- singleton instances are enclosed in square brackets

In [62]:
print(list("abc"))
print(["abc"])

['a', 'b', 'c']
['abc']
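
As a small preview of what "you can modify them" means, the sketch below changes a list in place and then tries the same on a tuple. The `try`/`except` construct is used here only to keep the snippet running past the error; `numbers` and `frozen` are hypothetical names.

```python
numbers = [1, 2, 3]
numbers[0] = 10        # replace an item in place
numbers.append(4)      # grow the list in place
print(numbers)         # [10, 2, 3, 4]

frozen = (1, 2, 3)
try:
    frozen[0] = 10     # tuples do not support item assignment
except TypeError as error:
    print("TypeError:", error)

print(frozen)          # (1, 2, 3), unchanged
```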


### Sets

Sequences have one weakness: searching for an item takes time proportional to the length of the sequence.

Moreover, sometimes we need to ensure each item in our collection occurs only once.

Enter the `set` datatype.

In [63]:
{1, 2, 3, 3}

{1, 2, 3}

In [64]:
1 in {1, 2, 3}

True

How do we achieve uniqueness and constant-time lookup?

The answer: *hash tables*

*TL;DR on hash tables: each item is stored at a position computed from its hash, which only works for hashable objects (in practice, immutable ones).*

In [65]:
{[1,2], 4}

TypeError: unhashable type: 'list'
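
To make the hash-table idea a bit more concrete, here is a deliberately simplified toy version built from a list of "buckets". It is only a sketch of the principle and not how Python's `set` is actually implemented; `NUM_BUCKETS`, `bucket_for`, `toy_add` and `toy_contains` are all hypothetical names.

```python
NUM_BUCKETS = 8
buckets = [[] for _ in range(NUM_BUCKETS)]

def bucket_for(item):
    # hash() maps the item to an integer; the remainder picks a bucket.
    return buckets[hash(item) % NUM_BUCKETS]

def toy_add(item):
    bucket = bucket_for(item)
    if item not in bucket:       # only this one small bucket is searched
        bucket.append(item)

def toy_contains(item):
    return item in bucket_for(item)

for value in [1, 2, 3, 3, "a"]:
    toy_add(value)

print(toy_contains(3))                # True
print(toy_contains(99))               # False
print(sum(len(b) for b in buckets))   # 4: the duplicate 3 was stored once
```

Because the bucket is chosen from the hash alone, a lookup never has to scan the whole collection. It also shows why unhashable objects such as lists are rejected: without a hash there is no way to pick a bucket.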

Sets cannot be combined like sequences with `+`.

In [66]:
{1, 2, 3} + {1, 2}

TypeError: unsupported operand type(s) for +: 'set' and 'set'

We call the `union` or `intersection` methods instead (the `|` and `&` operators do the same).

In [67]:
{1, 2, 3}.intersection({1, 2})

{1, 2}

In [68]:
{1, 2, 3}.union({3, 4})

{1, 2, 3, 4}

The difference of two sets can be computed with the `-` operator (or the `difference` method).

In [69]:
{1, 2, 3} - {1, 2}

{3}
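
Each of the common set methods has an operator counterpart. A quick sketch comparing them, with `a` and `b` as hypothetical example sets:

```python
a = {1, 2, 3}
b = {3, 4}

print((a | b) == a.union(b))                  # True
print((a & b) == a.intersection(b))           # True
print((a - b) == a.difference(b))             # True
print((a ^ b) == a.symmetric_difference(b))   # True

# The symmetric difference holds the items that are in exactly one of the sets.
print((a ^ b) == {1, 2, 4})                   # True
```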

You can turn any sequence of hashable items into a set by calling `set` on it.

In [70]:
set("string")

{'g', 'i', 'n', 'r', 's', 't'}

In [71]:
set([1, 2, 3])

{1, 2, 3}

### Dictionaries (Mappings)

Think "sets with values".

In [72]:
{"key": "value", 4: 8}

{'key': 'value', 4: 8}

Dictionaries are hash tables. Any hashable object can be a key.

In [79]:
{"string": 'value', 3: 'value', (3, 2): "value"}

{'string': 'value', 3: 'value', (3, 2): 'value'}

In [80]:
{[1, 2, 3]: "value"}

TypeError: unhashable type: 'list'

Absolutely any object can be a value.

In [81]:
{"key1": [1, 2, 3], "key2": {4: 4, 3: 3}}

{'key1': [1, 2, 3], 'key2': {3: 3, 4: 4}}

Since the primary use of dictionaries is to map keys to values, we need a way to retrieve the value for a given key.

In [73]:
{"key": "value", 4: 8}.get("key")

'value'

If the key is missing from the dictionary, we get `None`.

In [74]:
print({"key": "value", 4: 8}.get("alice"))

None


If we want a different value to be returned if we don't find a key, `get` accepts that as a second argument.

In [75]:
{"key": "value", 4: 8}.get("alice", 0)

0

Item access is common enough that there is a shortcut: square brackets.

In [76]:
{"key": "value", 4: 8}["key"]

'value'

In [77]:
{"key": "value", 4: 8}[4]

8

The square bracket notation is more fragile than `get`: a missing key raises a `KeyError` instead of returning `None`.

In [78]:
{"key": "value", 4: 8}["alice"]

KeyError: 'alice'

Dictionaries also have their own constructor which can be used to create a dictionary from a list of tuples.

In [82]:
dict([("key1", [1, 2, 3]), ("key2", {4: 4, 3: 3})])

{'key1': [1, 2, 3], 'key2': {3: 3, 4: 4}}
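
The constructor accepts other shapes of input as well. The sketch below shows two that come up often, plus how membership testing behaves on dictionaries; `keys`, `values` and `letters` are hypothetical names.

```python
keys = ["a", "b", "c"]
values = [1, 2, 3]

# zip pairs items up by position; dict turns the pairs into a mapping.
letters = dict(zip(keys, values))
print(letters)                          # {'a': 1, 'b': 2, 'c': 3}

# When the keys are valid names, keyword arguments work too.
print(dict(a=1, b=2, c=3) == letters)   # True

# Membership tests on a dictionary look at the keys, not the values.
print("a" in letters)                   # True
print(1 in letters)                     # False
```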

## Variable Assignment

Variable assignment syntax is pretty similar to other languages.
The left side is a name, the right side is a value.

In [83]:
x = 3
x

3

Legal variable names:

- can only contain letters, numbers, and underscores
- cannot start with a number

In [84]:
# Valid
guest123 = "guest"

In [85]:
# Invalid
123guest = "guest"

SyntaxError: invalid syntax (<ipython-input-85-0c7ca4b73c81>, line 2)

In [86]:
guest@ = "guest"

SyntaxError: invalid syntax (<ipython-input-86-80efe5afa24e>, line 1)

### The Golden Rules

- **Names refer to values**
- **Variable assignment *never* copies data!**

The global "namespace" is just a dictionary!

In [87]:
x = 4

These two are equivalent:

In [88]:
globals()['x']
x

4

*The point is: "x" is just a name!*

In [89]:
x = 4
y = x

All this does is associate a second name with the object *4*.

In [90]:
x = 5

What value does `y` refer to now?

In [91]:
y

4

What value does `y` refer to in the code below?

In [92]:
x = [1, 2, 3]
y = x
x.append(4)

In [93]:
y

[1, 2, 3, 4]

*The point is: assignment never copies data!*

You can also think of indices in sequences as names referring to values. Consider this example:

In [94]:
t = ([1, 2], 3)
t[0].append(4)

What does **t** refer to now?

In [95]:
t

([1, 2, 4], 3)
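
If a separate copy is what you want, you have to ask for one explicitly. A minimal sketch, assuming a shallow copy is enough (nested objects would still be shared); the `is` operator checks whether two names refer to the very same object.

```python
x = [1, 2, 3]
y = x            # second name, same list object
z = list(x)      # a new list holding the same items (x.copy() also works)

x.append(4)

print(y is x)    # True
print(z is x)    # False
print(y)         # [1, 2, 3, 4]
print(z)         # [1, 2, 3]
```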

### Variable Scope

This simply refers to where Python searches for the name when resolving a variable.

In [96]:
x = 10

def foo():
    print(x)

In [97]:
foo()

10


In [98]:
x = 10

def foo():
    print(x)
    x += 1

In [99]:
foo()

UnboundLocalError: local variable 'x' referenced before assignment

In [100]:
x = 10

def foo():
    global x
    print(x)
    x += 1

In [101]:
foo()

10
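
The `UnboundLocalError` above comes from the assignment `x += 1`: assigning to a name anywhere inside a function makes that name local to the whole function, so `print(x)` looks for a local `x` that has no value yet. One common alternative to `global`, sketched here with a hypothetical `increment` function, is to pass the value in and return the result.

```python
x = 10

def increment(value):
    # `value` is a local name; the global x is never touched in here.
    return value + 1

print(increment(x))   # 11
print(x)              # 10, unchanged

x = increment(x)      # rebinding the global name happens at the top level
print(x)              # 11
```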


### Unpacking Sequences

Build a URL string in the format `"protocol://hostname:port"` from a tuple.

In [102]:
def to_URL(url_info):
    return url_info[0] + "://" + url_info[1] + ":" + url_info[2]

Here's another version, which unpacks the tuple into named variables first:

In [103]:
def to_URL(url_info):
    protocol, hostname, port = url_info
    return protocol + "://" + hostname + ":" + port

Which one is clearer?
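
Both versions expect the port to be a string already, because `+` cannot join a string and an integer. A possible variant that accepts either is sketched below; `to_url_any_port` and the `example.org` hostname are assumptions made for this illustration only.

```python
def to_url_any_port(url_info):
    protocol, hostname, port = url_info
    # str() leaves a string unchanged and converts an integer to text.
    return protocol + "://" + hostname + ":" + str(port)

print(to_url_any_port(("https", "example.org", "443")))  # https://example.org:443
print(to_url_any_port(("https", "example.org", 443)))    # https://example.org:443
```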

Things are even more exciting in Python 3, where a starred name collects all the remaining items!

In [104]:
items = [1, 2, 3, 4]
a, *b = items
print(a)
print(b)

1
[2, 3, 4]


In [105]:
*c, d = items
print(c)
print(d)

[1, 2, 3]
4


In [106]:
e, *f, g = items
print(e)
print(f)
print(g)

1
[2, 3]
4
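
Unpacking shows up in a few other everyday patterns. The following sketch collects three of them; all names and values are made up for the illustration.

```python
# Swapping two names without a temporary variable.
a, b = 1, 2
a, b = b, a
print(a, b)                # 2 1

# Unpacking nested sequences in one step.
name, (row, column) = ("cell", (3, 7))
print(name, row, column)   # cell 3 7

# A starred name always receives a list, possibly an empty one.
first, *rest = [42]
print(first, rest)         # 42 []
```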


## Control Flow

*To recapitulate, so far we have learned how to:*

- *create and manipulate basic data structures*
- *assign data to variables*
- *write functions*

*However, we are still missing quite a bit of functionality.*

### Conditionals

In [107]:
# Conditionals 
if 3 < 4:
    print("Phew!!")
else:
    print("What?!?")

Phew!!


You can check the "truthiness" of more than just booleans:

In [108]:
# You can check the "truthiness" of more than just booleans!
if [1, 2, 3]:
    print("list isn't empty")
if 9:
    print("integer is not zero")
if 0:
    print('integer is zero')

list isn't empty
integer is not zero


In [109]:
# Try removing `not` from the conditional
if not 0:
    print("integer is zero")

integer is zero


Python also has `and` and `or` for combining boolean expressions together.
They behave as you would expect them to.
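
A brief sketch of these operators in action. One detail that may be less expected: `and` and `or` stop evaluating as soon as the result is known and return one of their operands rather than always a boolean. The `age` variable and the string values are hypothetical.

```python
print(True and False)          # False
print(True or False)           # True
print(not True)                # False

# Combining comparisons.
age = 20
print(age >= 18 and age < 65)  # True

# and/or return one of their operands ("short-circuiting").
print(0 or "default")          # default
print([] and "unused")         # []
```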

#### Exercise: FizzBuzz
Write a function that accepts an integer and prints it. However, for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.

#### Conditional Assignment

Remember the section on assignment?

In [110]:
x = None
y = 3 if x is None else 5
y

3

In [111]:
# This gives the same result here, but `not x` is also true for 0, '' and empty collections
x = None
y = 3 if not x else 5
y

3
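
The two conditions only agree for some inputs: `x is None` matches `None` alone, while `not x` matches every "falsy" value. A small comparison, using the hypothetical helper functions `pick_strict` and `pick_truthy`:

```python
def pick_strict(x):
    return 3 if x is None else 5

def pick_truthy(x):
    return 3 if not x else 5

print(pick_strict(None), pick_truthy(None))   # 3 3
print(pick_strict(0), pick_truthy(0))         # 5 3
print(pick_strict([]), pick_truthy([]))       # 5 3
print(pick_strict("a"), pick_truthy("a"))     # 5 5
```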

### Looping

In [112]:
stop = 5
counter = 1
while counter < stop:
    print("hello")
    counter = counter + 1

hello
hello
hello
hello


Given the tools we have currently, how would we print all items in a collection?

In [113]:
def print_all(collection):
    index = 0
    while index < len(collection):
        print(collection[index])
        index += 1

In [114]:
print_all([1, 2, 3])

1
2
3


This can't handle dictionaries and sets. Let's fix that!

In [115]:
def print_all(collection):
    if isinstance(collection, set):
        collection = list(collection)
    elif isinstance(collection, dict):
        collection = list(collection.keys())
    index = 0
    while index < len(collection):
        print(collection[index])
        index += 1

In [118]:
print_all([1, 2, 3])

1
2
3


In [118]:
print_all((1, 2, 3))

1
2
3


In [116]:
print_all({2, 3, 1})

1
2
3


In [117]:
print_all({"a": 2, "b": 3, "c":1})

a
b
c


This is difficult to read *and* inefficient: we have to create a new list every time!

Let's use our secret weapon: **wishful thinking**

In [119]:
def print_all(collection):
    iterable = loop_over_me(collection)
    while has_more_items(iterable):
        print(get_next_item(iterable))

Python to the rescue!

`iter` turns any collection into an iterator: an object that hands out the collection's items one at a time.

In [120]:
iter([1, 2, 3])

<list_iterator at 0x7f5f445927f0>

In [121]:
iter((1, 2, 3))

<tuple_iterator at 0x7f5f44592a90>

In [122]:
iter({"a": 1, "b": 2})

<dict_keyiterator at 0x7f5f4452c188>

Its pal `next` gets the next element from an iterator.

In [123]:
a = iter([1, 2, 3])
next(a)

1

In [124]:
next(a)

2

In [125]:
b = iter({1, 2, 3})
next(b)

1

Let's put our new friends `iter` and `next` into our `print_all` function!

In [126]:
def print_all(collection):
    iterable = iter(collection)
    while has_more_items(iterable):
        print(next(iterable))

How do we make sure we stop when the collection has run out of items?

*Check this out*

In [127]:
a = iter([1, 2])
next(a)

1

In [128]:
next(a)

2

In [129]:
next(a)

StopIteration: 

This is a clue!

In [130]:
def print_all(collection):
    iterable = iter(collection)
    while True:
        try:
            print(next(iterable))
        except StopIteration:
            break

In [131]:
print_all([1, 2, 3])

1
2
3


In [132]:
print_all({1, 2, 3})

1
2
3


In [133]:
print_all({"a": 1, "b": 2, "c": 3})

a
b
c


Do we have to re-write the whole exception-catching business to do something different than print?

Thank God no. Python's `for`-loop does the job for us!

In [134]:
for item in [1, 2, 3]:
    print(item)

1
2
3


In [135]:
for item in {1, 2, 3}:
    print(item)

1
2
3


Why did we have to derive something that's built into the language?

Looping in Python is much more flexible and general than just going over elements of a list.

Anything that can be passed to `iter` can be looped over.

This opens up a world of possibilities.
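
A few examples of what that covers, as a sketch with made-up data (`ages`, `numbers` and `remaining` are hypothetical names):

```python
# Strings yield their characters.
for character in "abc":
    print(character)

# Dictionaries yield their keys; .items() yields (key, value) pairs,
# which can be unpacked right in the for statement.
ages = {"alice": 31, "bob": 27}
for name, age in ages.items():
    print(name, age)

# An iterator is itself iterable: the loop picks up where next() left off.
numbers = iter([1, 2, 3])
next(numbers)               # consume the first item
remaining = []
for n in numbers:
    remaining.append(n)
print(remaining)            # [2, 3]
```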

Want to loop over a large range of numbers without loading all of them into memory at once? Use the `range` function.

In [136]:
for n in range(1,4):
    print(n)

1
2
3


Want to loop over both items and their positions in the sequence?

In [137]:
colors = ['red', 'green', 'blue', 'yellow']
for i, c in enumerate(colors):
    print(i, c)

0 red
1 green
2 blue
3 yellow


Both `range` and `enumerate` return something "iterable" that produces its items one at a time instead of building them all in memory up front.
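
This can be checked directly. The sketch below relies on `sys.getsizeof`, which reports the size of the object itself in bytes; the exact numbers depend on the Python build, so only a comparison is shown. `big` and `pairs` are hypothetical names.

```python
import sys

big = range(10**9)      # created instantly; no billion-item list is built
print(len(big))         # 1000000000
print(big[-1])          # 999999999

# The range object is smaller than even a modest list of numbers.
print(sys.getsizeof(big) < sys.getsizeof(list(range(1000))))   # True

# enumerate hands out (index, item) pairs one at a time.
pairs = enumerate(["red", "green"])
print(next(pairs))      # (0, 'red')
print(next(pairs))      # (1, 'green')
```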

And they say Python is inefficient :)