# Data structures and Functions

1. 
* Mutable and immutable types
* String
* List
* Tuple (and named tuple)
* Set (+frozen set)  
* Dict    
* File-like objects

2. 
* Functions

* Arguments

* Namespaces

* Scopes (LEGB)

* Enclosing

### Data
Facts, terms, or instructions represented in a format suitable for processing.

### Data structures
A data structure is a format for organizing, managing, and storing data that enables efficient access and modification.


#### A mutable object can be changed after it's created, and an immutable object can't. 

<img src="images/datatypes.png" width="2500" height="1500">

## String
An ordered sequence of characters, used to store and represent text.

Since Python 3, a string is a sequence of Unicode characters (code points).

## String literals

In [5]:
print("python" == 'python')

print("we don't miss Python lectures")
# print('we don't miss Python lectures')

True
we don't miss Python lectures


In [133]:
print('Guido van Rossum: "Python is an experiment in how much freedom programmers need."')
# print("Guido van Rossum: "Python is an experiment in how much freedom programmers need."")

SyntaxError: invalid syntax (<ipython-input-133-ad60d5e93d2b>, line 2)

In [1]:
a = "cat " \
    "dog"

print(a)
print("cat"" dog")
print("cat"                      " dog")

cat dog
cat dog
cat dog


## Multiline strings

In [39]:
big_1 = """a very very
... very big
... string"""

big_2 = """
a very very
much bigger
than normal one
"""

print(big_1)
print()
print(big_2)

a very very
very big
string


a very very
much bigger
than normal one



# Escape Characters
To insert characters that are illegal in a string, use an escape character.


In [8]:
problem = 'C:\teeeeeeext.txt'
no_problem = r'C:\teeeeeeext.txt'
no_problem_2 = 'C:\\teeeeeeext.txt'

print(problem, no_problem, no_problem_2, sep='\n')

C:	eeeeeeext.txt
C:\teeeeeeext.txt
C:\teeeeeeext.txt


![image.png](attachment:image.png)

# Character encoding

__ASCII__ (American Standard Code for Information Interchange) - a 7-bit character code in which each of the 128 code values represents a unique character.

__Unicode__ - a specification that aims to list every character used by human languages and give each character its own unique code.

__Byte char representation__

- Unicode Transformation Format (UTF-8, UTF-16, UTF-32) - each char is encoded with one or more N-__bit__ code units
- Universal Character Set (UCS-2, UCS-4) - a fixed N __bytes__ for each char



## Flexible String Representation

Python 3.3 (PEP-393)

- If every code point in the string is below 2^8 (ASCII and Latin-1), each char is stored in 1 byte (UCS-1).

- If the largest code point is below 2^16, each char takes 2 bytes (UCS-2).

- In other cases - UCS-4 (4 bytes per char).

https://www.python.org/dev/peps/pep-0393/
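A rough way to observe this is to measure how much a string grows per added character. This is only a sketch: the helper name `bytes_per_char` is made up here, and the numbers are a CPython implementation detail rather than a language guarantee.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character from two strings of the same kind."""
    # The fixed object header cancels out in the difference.
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) / n


print(bytes_per_char("a"))           # ASCII letter
print(bytes_per_char("ß"))           # code point 223, still below 2**8
print(bytes_per_char("Ы"))           # code point 1067, below 2**16
print(bytes_per_char("\U0001F600"))  # code point above 2**16
```

On CPython 3.3 or newer this is expected to print 1.0, 1.0, 2.0 and 4.0.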


In [3]:
var = "Ы"

print(ord(var))
print(chr(ord(var)))

1067
Ы


In [9]:
char = "\N{LATIN SMALL LETTER SHARP S}"
char

'ß'

In [10]:
char.upper()

'SS'

In [11]:
char.upper().lower()

'ss'

## ascii(), str(), repr()

### Common

Built-in functions that return a string representation of an object.


### What's the difference

__repr__ - the “official” string representation of an object, aimed at developers

__str__ - a nicely printable, human-readable string representation of an object

__ascii__ - like repr, but replaces non-ASCII characters with escape sequences


In [140]:
import datetime
today = datetime.datetime.now() 

print("str() -", str(today),
      "\nrepr() -", repr(today))

mot = "Les garçons"
print("\nrepr() -", repr(mot),
      "\nascii() -", ascii(mot)) 

str() - 2019-11-07 18:33:17.341678 
repr() - datetime.datetime(2019, 11, 7, 18, 33, 17, 341678)

repr() - 'Les garçons' 
ascii() - 'Les gar\xe7ons'


## A method is a function associated with an object that gives access to manipulating it



In [1]:
example = "Python is awesome" 
example.split('is')

['Python ', ' awesome']

##  find() - to get substring index


In [163]:
example.find("aw")

10

## If no match return -1

In [10]:
example.find("laaaaaaaa")

-1

## Search starting from the right end


In [2]:
example.rfind("some")

13

## Another option for finding a substring index

In [166]:
example.index("Py")

0

## But what's the difference between index and find?

In [167]:
example.index("Snake")

ValueError: substring not found
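The difference matters when handling a miss. A minimal sketch of both styles, reusing the `example` string from above:

```python
example = "Python is awesome"

# find() signals "not found" with -1, so the result must be checked explicitly.
position = example.find("Snake")
if position == -1:
    print("find: no match")

# index() raises instead, which suits code where a miss is an error.
try:
    example.index("Snake")
except ValueError as error:
    print("index:", error)

# When only presence matters, the `in` operator is the simplest check.
print("awesome" in example)  # True
```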

# Strings can be concatenated

In [17]:
first = "lenin"
second = "grib"
third = first + " " + second + " "
print(third)

# join using the given separator

" ".join([first, second])


lenin grib 


'lenin grib'

### And repeated

In [18]:
print(third*10)

lenin grib lenin grib lenin grib lenin grib lenin grib lenin grib lenin grib lenin grib lenin grib lenin grib 


## String slicing

In [19]:
example = '0123456'

print(example[:2])
print(example[:-1])
print(example[2:3])
print(example[3:-1])

01
012345
2
345


# Can we mutate string?

In [20]:
# Nope. Only create new object
third.replace("grib ", "molodec!")

'lenin molodec!'

# Trim whitespace chars from the beginning and the end
_______________________

S.lstrip([chars]) - from the beginning

S.rstrip([chars]) - from the end

S.strip([chars]) - from both ends

In [21]:
user_input = '   \t My name is \t Vasya \n\n\n'

user_input.strip()

'My name is \t Vasya'

# Change cases

In [27]:
print(third)
print()
print(
    "Upper case -" + third.upper(),
    "Title -" + third.title(),
    "Lower case -" + third.lower(),
    "Capitalize -" + third.capitalize(),
    "Swapcase - " + third.swapcase(),
    sep="\n",
)

lenin grib 

Upper case -LENIN GRIB 
Title -Lenin Grib 
Lower case -lenin grib 
Capitalize -Lenin grib 
Swapcase - LENIN GRIB 


In [12]:
name = "vasya pupkin"

name.capitalize()

# how to capitalize both words?

'Vasya pupkin'
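Two standard answers to that question are `str.title()` and `string.capwords()`. A short sketch of both, including a case where they differ (the `phrase` value is made up for illustration):

```python
import string

name = "vasya pupkin"

print(name.title())           # Vasya Pupkin
print(string.capwords(name))  # Vasya Pupkin

# title() treats every non-letter as a word boundary, capwords() splits on whitespace.
phrase = "they're here"
print(phrase.title())           # They'Re Here
print(string.capwords(phrase))  # They're Here
```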

# Get some information about string
___________________________
S.isdigit() - consists only of digits

S.isalpha() - consists only of letters

S.isalnum() - consists only of letters and digits
_________
S.islower() - all cased chars are lowercase

S.isupper() - all cased chars are uppercase

S.istitle() - every word starts with an uppercase char followed by lowercase chars

___________

`S.isspace()` - checks whether all the characters in the string are whitespace
___________

`S.startswith(pattern)` - starts with pattern

`S.endswith(pattern)` - ends with pattern

#### But why do we call it immutable?

In [13]:
s = 'blah blah'
s[2] = "A"

TypeError: 'str' object does not support item assignment

__but we can create new objects__

In [14]:
replace_result = s.replace("a", "A")

print(id(replace_result), id(s))
print(s, replace_result)

4634302320 4634401456
blah blah blAh blAh


# String formatting
________________________________


### Approach 1 - real oldschool % 

- difficult to read on big strings
- could lead to bugs

In [16]:
price = "cheap"
name = "Linus Torvalds"
"'Talk is %s. Show me the code.' ― %s" % (price, name)

"'Talk is cheap. Show me the code.' ― Linus Torvalds"

### Approach 2 - .format()
_______________________________
- from Python 2.6
- much more readable than %-formatting

In [17]:
"'Talk is {}. Show me the code.' ― {}".format(price, name)

"'Talk is cheap. Show me the code.' ― Linus Torvalds"

In [18]:
"'Talk is {1}. Show me the code.' ― {0}".format(name, price)

"'Talk is cheap. Show me the code.' ― Linus Torvalds"

In [19]:
data = {"price": "cheap", "name": "Linus Torvalds"}
"'Talk is {price}. Show me the code.' ― {name}".format(**data)

"'Talk is cheap. Show me the code.' ― Linus Torvalds"

### Approach 3 - just one simple.. great magnificent f 
- from Python 3.6 
- fastest approach (1.4 times faster than .format())
- PEP 498
<img src="images/f.jpeg" width="300" height="300">

In [20]:

f"'Talk is {price}. Show me the code.' ― {name}" 


"'Talk is cheap. Show me the code.' ― Linus Torvalds"

### Place expression into format brackets

In [19]:
f"ddddddddddddddddd{2 * 46}"

'ddddddddddddddddd92'
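The braces also accept a format specification after `:` and a conversion such as `!r`. A few hedged examples; the `pi` variable is introduced only for this sketch:

```python
price = "cheap"
name = "Linus Torvalds"
pi = 3.14159

print(f"{pi:.2f}")               # 3.14 (two digits after the point)
print(f"{42:>5}|")               # '   42|' (right-aligned in 5 columns)
print(f"{1234567:,}")            # 1,234,567 (thousands separator)
print(f"{name!r}")               # 'Linus Torvalds' (repr() of the value)
print(f"{price.upper()} talk")   # CHEAP talk (any expression is allowed)
```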

### What we have learned
- string data type
- string encodings
- string methods
- string formatting

![image.png](attachment:image.png)

# List

### Lists are mutable sequences, typically used to store collections of homogeneous items (where the precise degree of similarity will vary by application). 

Under the hood, a list is a dynamic array.

### Lists may be constructed in several ways:

A list can contain any number of objects.

In [189]:
snake = list('hat')
noble_programmer = ["Knuth", "Tanenbaum", "bike"]
matrix = [[1, 2], [3, 4]]
hungry_list = []

### Important: when a list is built with `*`, the values are not copied; every slot references the same object

In [34]:
matrix = [[0]] * 3
matrix[0][0] = 'Null'

# what is inside the matrix now?

In [35]:
matrix 

[['Null'], ['Null'], ['Null']]

In [38]:
matrix = [[0] for i in range(3)]
matrix[0][0] = 'Null'
matrix

[['Null'], [0], [0]]
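The difference between the two constructions can be checked with identity tests. A small sketch (the names `shared` and `independent` are chosen here for illustration):

```python
shared = [[0]] * 3                     # three references to one inner list
independent = [[0] for _ in range(3)]  # three separate inner lists

print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False

# Count the distinct inner objects in each matrix.
print(len({id(row) for row in shared}))       # 1
print(len({id(row) for row in independent}))  # 3
```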

## Main methods:

### Add new elements

- __append__ and __extend__ - add one element or sequence to the end

In [144]:
l = [1, 2, 3]
l.append([4])
l.extend([5, 6])
l  

[1, 2, 3, [4], 5, 6]

- insert an element at a specific position with __insert__

In [90]:
l = [1, 2, 3]
l.insert(0, 'Null')
print(l)
l.insert(-1, 'some')
print(l)

['Null', 1, 2, 3]
['Null', 1, 2, 'some', 3]


### Alter list
- change sequence:

In [147]:
l = [1, 2, 3]
l[:2] = [5] * 2
l[0] = 1
l

[1, 5, 3]

### Concatenate lists

In [195]:
l1 = [1, 2]
l2 = [3]
print(id(l1), id(l2))
print(id(l1 + l2))

139803672921352 139803672915400
139803672920968


__Inplace__ concatenation 

In [196]:
l1 = [1, 2]
l2 = [3]
print(id(l1), id(l2))
l1 += l2 
print(id(l1))

139803672922696 139803673082504
139803672922696


### Remove elements
- remove element or sequence from list -- __del__ using index

In [148]:
l = [1, 2, 3]
del l[:2]
print(l)
del l[:]
print(l)
del l
l

[3]
[]


NameError: name 'l' is not defined

- __pop__ to get deleted value

In [40]:
l = [1, 2, 3]
popped = l.pop(0)
print(popped, l)

1 [2, 3]


- delete first occurrence

In [94]:
kebab = [1, 1, 3, 1]
kebab.remove(1)
kebab

[1, 3, 1]

__list.index(x[, start[, end]])__

returns the index of the first occurrence of x (searching from start to end)

In [201]:
l = [1, 2, 3]
l.index(1)

0

In [202]:
l = [1, 2, 3]
l.index(46)

ValueError: 46 is not in list

__list.count(x)__
Counts elements with value x

In [203]:
l = [1, 1, 1, 2, 3]
l.count(1)

3

In [204]:
l = [1, 1, 1, 2, 3]
l.count(4)

0

### How to reverse list?

In [21]:
l = [1, 2, 3]
l.reverse()
l

[3, 2, 1]

In [206]:
l = [1, 2, 3]
lr = l[::-1]
lr

[3, 2, 1]

In [23]:
l = [1, 2, 3]
lr = reversed(l)
print(lr)
list(lr)

<list_reverseiterator object at 0x114435910>


[3, 2, 1]

### How to sort list ?

In [25]:
l = [3, 1, 2]
l.sort()
l

[1, 2, 3]



In [26]:
l = [3, 1, 2]
l_sorted = sorted(l)
print(l, l_sorted)

[3, 1, 2] [1, 2, 3]


We can specify the order and the key for both __sorted__ and __sort__

In [41]:
l = [3, 4, 2, 1]
l.sort(key=lambda x: x % 2, reverse=False)
l

[4, 2, 3, 1]

### List comprehensions

List comprehensions provide a concise way to create lists. 

Logically identical to for loop.

A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. 

In [50]:
a = []
for i in range(10):
    if i % 2 == 0:
        a.append(i)

lol = [i for i in range(10) if i % 2 == 0]
print(a == lol)
lol

True


[0, 2, 4, 6, 8]

Nice alternative to __map__ and __filter__

In [214]:
list(map(lambda x: x ** 2,
         filter(lambda x: x % 2 == 1,
                range(10))))

[1, 9, 25, 49, 81]
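For comparison, a sketch of the same computation written as a list comprehension (the variable names are chosen here for illustration):

```python
# Squares of the odd numbers below 10, without map, filter or lambda.
squares_of_odds = [x ** 2 for x in range(10) if x % 2 == 1]
print(squares_of_odds)  # [1, 9, 25, 49, 81]

# The map/filter version yields the same list.
via_map = list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 1, range(10))))
print(squares_of_odds == via_map)  # True
```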

In [27]:
# list comprehensions could be nested
[[j for j in range(5)] for i in range(5)]

[[0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4]]

In [152]:
matrix = [[1, 2, 3], [4, 5], [6, 7, 8, 9]] 
[val for sublist in matrix for val in sublist] 
from itertools import chain
list(chain(*matrix))

[1, 2, 3, 4, 5, 6, 7, 8, 9]

## Slicing

```
a[start:stop]  # items start through stop-1
a[start:]      # items start through the rest of the array
a[:stop]       # items from the beginning through stop-1
a[:]           # a copy of the whole array
```

The item at index `stop` is not included in the selected slice.

```
a[::-1]    # all items in the array, reversed
a[1::-1]   # the first two items, reversed
a[:-3:-1]  # the last two items, reversed
a[-3::-1]  # everything except the last two items, reversed
```
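To make the patterns above concrete, here is a sketch that applies them to a small list; the list `a` is just sample data:

```python
a = list(range(10))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(a[2:5])     # [2, 3, 4]
print(a[:3])      # [0, 1, 2]
print(a[::2])     # [0, 2, 4, 6, 8]
print(a[1::-1])   # [1, 0]
print(a[:-3:-1])  # [9, 8]
print(a[-3::-1])  # [7, 6, 5, 4, 3, 2, 1, 0]

# a[:] builds a new list object with the same items.
print(a[:] == a, a[:] is a)  # True False
```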

The slice object is used to slice a given sequence (string, bytes, tuple, list or range) or any object which supports the sequence protocol (implements the `__getitem__()` and `__len__()` methods).

slice(stop)
slice(start, stop, step)

In [155]:
employee = ['Vasya', 'Pupkin', 'senior', '300k/ns']
NAME = slice(2)
POSITION = slice(1, 3)
SALARY = slice(3, None)
print(employee[NAME])
print(employee[POSITION])
print(SALARY)
NAME

['Vasya', 'Pupkin']
['Pupkin', 'senior']
slice(3, None, None)


slice(None, 2, None)

### Now we know:

* how to initialize lists
* list methods
* some of the built-in functions
* Slicing

# Tuple

A tuple is a collection which is ordered and unchangeable. In Python tuples are written with parentheses.

In [28]:
point = (1, 2)
date = ('may', 22)

same_point = 1, 2
same_date = 'may', 22

One-element tuple should be defined with parentheses

In [30]:
# bad - could lead to bugs
t = 'some',
t

('some',)

A tuple is almost the same as a list.
So why do we need it?

 * Protects the sequence from mutation

 * Usually faster than a list

 * Consumes less memory

 * Can be a dictionary key

In [161]:
lst = [10, 20, 30]
tpl = (10, 20, 30)
print(lst.__sizeof__())
print(tpl.__sizeof__())

64
48


Tuples are compared position by position: the first item of the first tuple is compared to the first item of the second tuple; if they are not equal, this is the result of the comparison, else the second item is considered, then the third and so on. 

In [158]:
(1,2,3) < (1,2,4)


True

In [224]:
(1, 2, 3, 4) < (1, 2, 4)

True

Can be concatenated using __+__ operator. New object is created.

### Reverse tuple

In [225]:
tuple(reversed((1, 2, 3)))

(3, 2, 1)

In [226]:
(1, 2, 3)[::-1]

(3, 2, 1)

### Why do we need the 'extra' function reversed?

* a slice always creates a copy
* `reversed` returns an iterator: it works with any sequence and needs only O(1) extra memory

### enumerate

In [57]:
my_list = ['apple', 'banana', 'grapes', 'pear']
for c, value in enumerate(my_list, 10):
    print(c, value)

10 apple
11 banana
12 grapes
13 pear


Iterate over the object itself instead of looping with a counter. If you need the index, use enumerate.

In [31]:
# bad
for i in range(len(xs)) :
    x = xs[i]

# better
for x in xs:
    ...

# or
for i, x in enumerate(xs):
    ...

NameError: name 'xs' is not defined

#### [built-in functions](https://docs.python.org/3/library/functions.html)

### tuple vs list

__list__
* Mutable

* slower

* homogeneous sequence

* Unhashable


__tuple__
* Immutable

* faster

* heterogeneous sequence

* Hashable


![image.png](attachment:image.png)

### hashable

An object is said to be hashable if it has a hash value that remains the same during its lifetime. It has a `__hash__()` method and it can be compared to other objects; for this, it needs an `__eq__()` method (Python 2 also accepted `__cmp__()`).
Instances of user-defined classes are hashable by default: their hash is derived from the object id.
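A sketch of what this means in practice. The classes `Point` and `Plain` are hypothetical and exist only for this example:

```python
class Point:
    """Value-like class: equal coordinates mean equal objects."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal objects must produce equal hashes.
        return hash((self.x, self.y))


class Plain:
    """Defines neither __eq__ nor __hash__: identity-based defaults apply."""


visited = {Point(1, 2): "seen"}
print(visited[Point(1, 2)])  # seen (found through __hash__ and __eq__)
print(Plain() == Plain())    # False (two distinct objects)

try:
    {[1, 2]: "list as key"}
except TypeError as error:
    print(error)             # unhashable type: 'list'
```

Since the hash depends on `x` and `y`, such an object should not be mutated after it has been used as a key.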

In [229]:
a = (1, 2, 3)
c = (1, 2, 3)
a == c

True

In [230]:
a.__hash__() == c.__hash__()

True

In [231]:
id(a) != id(c)

True

![image.png](attachment:image.png)

In [41]:
hash(-1) == hash(-2)

True

#### What would be in the dict?

In [32]:
d = {
    1: '1',
    1.0: '1.0',
    True: 'True'
}

In [33]:
d

{1: 'True'}

## namedtuple

Named tuples are basically easy-to-create, lightweight object types. Named tuple instances can be referenced using object-like variable dereferencing or the standard tuple syntax. They can be used similarly to `struct` or other common record types, except that they are immutable.

In [162]:
from collections import namedtuple

Car = namedtuple('Car' , 'color mileage')
Cat = namedtuple('Cat' , 'color')

my_car = Car('red', 3812.4)
my_cat = Cat("white")
my_car + my_cat

('red', 3812.4, 'white')

In [235]:
my_car.color

'red'

Can use indexes:

In [236]:
my_car[0]

'red'

### Has a few nice methods

In [237]:
my_car._asdict()

OrderedDict([('color', 'red'), ('mileage', 3812.4)])

### `_replace()` method creates shallow copy in order to modify some of attributes

In [238]:
my_car._replace(color='blue')

Car(color='blue', mileage=3812.4)

### `_make()` creates new instance from sequence

In [239]:
Car._make(['red', 999])

Car(color='red', mileage=999)

## Multiple assignment and unpacking

In [240]:
x, y = 10, 20

### Same result but different spelling

In [241]:
x, y = 10, 20
x, y = (10, 20)
(x, y) = 10, 20
(x, y) = (10, 20)

Can be used with any iterable object

In [242]:
x, y = [10, 20]
x

10

In [243]:
x, y = 'hi'
x

'h'

Works with any number of objects and even with variables

In [244]:
point = 10, 20, 30
x, y, z = point
print(x, y, z)
(x, y, z) = (z, y, x)
print(x, y, z)

10 20 30
30 20 10


Can use asterisk:

In [245]:
first, *_, last = range(4)
first, last

(0, 3)

In [246]:
print(*[1], *[2], 3)
{*range(4), 4}

1 2 3


{0, 1, 2, 3, 4}

In [247]:
first, *middle, last = range(4)
middle, type(middle)

([1, 2], list)

### Unpacking to the asterisk is always [list](https://www.python.org/dev/peps/pep-3132/#acceptance)

## Double asterisk ** works the same way

Later keys overwrite earlier ones

In [248]:
dict(**{'x': 1}, y=2, **{'z': 3})

{'x': 1, 'y': 2, 'z': 3}

In [165]:
{'x': 1, **{"x": 3}, **{'x': 2}}

{'x': 2}


## Set
__unordered__ collections of unique hashable objects


### When to use?
1. remove duplicates
2. Membership testing
3. count intersections, union, difference, symmetric difference


#### NB!
- No indexing and slicing
- Insertion order is not preserved

### How to initialize?

In [34]:
container = {1, 2, 3}
for i in container:
    print(i)

1
2
3


In [37]:
new_set = {i for i in range(5)}
new_set

{0, 1, 2, 3, 4}

No literals for empty set:

In [35]:
this_is_not_a_set_sorry = {}
type(this_is_not_a_set_sorry)

dict

But we can create one with elements inside it

In [36]:
brand_new_set = {"youth", "of", "the", "nation"}
print(brand_new_set, id(brand_new_set))
brand_new_set.add("me")
print(brand_new_set, id(brand_new_set))

{'of', 'the', 'youth', 'nation'} 4634879712
{'me', 'of', 'the', 'youth', 'nation'} 4634879712


What will happen if we pass iterable to the constructor?

In [38]:
s = set("privet")
s

{'e', 'i', 'p', 'r', 't', 'v'}

## Set methods

### Returns True or False:

set.isdisjoint(another_set) - True, if set and another_set have no common elements.

set.issubset(another_set) - True, if all set elements are in another_set.

set.issuperset(another_set) - True, if all another_set's elements belong to set.


### Set theory:

set.union(set1, ...) - set's union.

set.intersection(set1, ...) - set's intersection.

set.difference(other, ...) - elements of set that are not in any of the others.

set.symmetric_difference(other) - elements that are in exactly one of set and other.
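These methods also have operator forms. A short sketch with two made-up sets; `sorted()` is used only to make the printed order deterministic:

```python
backend = {"python", "go", "sql"}
data = {"python", "sql", "r"}

print(sorted(backend | data))  # union: ['go', 'python', 'r', 'sql']
print(sorted(backend & data))  # intersection: ['python', 'sql']
print(sorted(backend - data))  # difference: ['go']
print(sorted(backend ^ data))  # symmetric difference: ['go', 'r']

print(backend.isdisjoint({"rust"}))  # True
print({"sql"}.issubset(data))        # True
print(data.issuperset({"r", "sql"})) # True
```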


### Mutate set

set.add(elem) - adds an element.

set.remove(elem) - removes the element from the set. __KeyError__ if there is no such element.

set.discard(elem) - removes the element if it is in the set.

set.pop() - removes an arbitrary element and returns it. __KeyError__ if the set is empty.

set.clear() - removes all elements

## frozenset - immutable set 

we basically remove all methods that can mutate object and call it frozen.

In [255]:
frozen = frozenset("diplome")
frozen.add("water")

AttributeError: 'frozenset' object has no attribute 'add'

#### Summary

- We know what is the set and how to use it.

# dict -- collection of key-value pairs with access by key (keeps insertion order since Python 3.7)

### How to initialize dict?

## 1. Using literals

In [256]:
dictionary = {}
# or
dictionary = {'West World': 8.4, 'True Detective': 7.6}
dictionary

{'West World': 8.4, 'True Detective': 7.6}

## 2. function dict()

In [257]:
dictionary = dict()
# or
dictionary = dict([(1, 100), (2, 400)])
dictionary

{1: 100, 2: 400}

## 3. Method .fromkeys()

In [258]:
dictionary = dict.fromkeys(
    ["Leonardo", "Donatello", "Raphael", "Michelangelo"], 
    "ninja"
)
dictionary

{'Leonardo': 'ninja',
 'Donatello': 'ninja',
 'Raphael': 'ninja',
 'Michelangelo': 'ninja'}

## 4. dict-comprehension

In [41]:
dictionary = {number: number**2 for number in range(0, 7, 2)}
dictionary

{0: 0, 2: 4, 4: 16, 6: 36}

## A key must be
- unique within the dict
- hashable

## Dict methods

In [260]:
[type(i) for i in dictionary.items()]

[tuple, tuple, tuple, tuple]

`dict.items()` - returns a view of the (key, value) pairs.

`dict.keys()` - returns a view of the dict's keys.

`dict.values()` - returns a view of the dict's values.

`dict.setdefault(key[, default])` - returns the value for the key. If there is no such key in the dict, inserts it with the given value (default is None).


In [42]:
dictionary.setdefault("new")
dictionary.setdefault("new_2", 'new_2')
dictionary

{0: 0, 2: 4, 4: 16, 6: 36, 'new': None, 'new_2': 'new_2'}

`dict.get(key[, default])` - returns the value for the key. If there is no such key in the dict, returns default (None unless specified).

In [43]:
dictionary.get("smth", "other")

'other'

__dict.pop(key[, default])__ - removes the key and returns its value. If the key is missing, returns default (or raises KeyError when no default is given).

__dict.popitem()__ - removes and returns a (key, value) pair. If the dict is empty, raises KeyError.


__dict.update([other])__ - updates the dict in place with (key, value) pairs from other. Values of existing keys are overwritten.
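A brief sketch of these three methods together. The `ratings` data is sample data, and the `popitem()` result assumes Python 3.7 or newer, where the last inserted pair is removed:

```python
ratings = {"West World": 8.4, "True Detective": 7.6}

removed = ratings.pop("West World")  # key exists: its value is returned
missing = ratings.pop("Lost", None)  # key is absent: the default is returned

# update() overwrites existing keys and adds new ones in place.
ratings.update({"True Detective": 8.9, "Dark": 8.7})

last_pair = ratings.popitem()        # removes the most recently inserted pair

print(removed, missing)  # 8.4 None
print(last_pair)         # ('Dark', 8.7)
print(ratings)           # {'True Detective': 8.9}
```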

### Now we know

- How to create new dictionary
- what can be a key in a dict
- Dict methods

## Time complexity

List

![image.png](attachment:image.png)

Dict

![image.png](attachment:image.png)

Set - the CPython implementation is very similar to dict

![image.png](attachment:image.png)

# Files, File-like object and how to use them

### Function open

Python treats text and binary data very differently.

The function open creates a new file object. It accepts one required argument -- the path to the file.

In [263]:
open('./data/test.txt')

<_io.TextIOWrapper name='./data/test.txt' mode='r' encoding='UTF-8'>

open has a lot of arguments; a few important ones:
* __mode__ - how to open file:
  * "r", "w", "x", "a", "+"
  * "b", "t".
* for text files:
  * __encoding__

| | r | r+ | w | w+ | a | a+ |
|---|---|---|---|---|---|---|
| read | + | + | | + | | + |
| write | | + | + | + | + | + |
| write after seek | | + | + | + | | |
| create | | | + | + | + | + |
| truncate | | | + | + | | |
| position at start | + | + | + | + | | |
| position at end | | | | | + | + |

### Methods for working with files

#### Reading

Method __read__ reads at most n characters from the file

In [142]:
file_handle = open('./data/test.txt')
file_handle.read(7)

'line1\nl'

__readline__ and __readlines__ read one line or all remaining lines from the file.


In [143]:
file_handle = open('./data/test.txt')
print(len(file_handle.readline()))

file_handle.readlines()

6


['line2\n', 'line3\n']

### Write

In [128]:
file_handle = open("./data/example.txt", "w")
file_handle.write('someinformation')

15

__writelines__ writes a sequence of lines to the file. It does not add line separators, so remember to add the newline char `\n` yourself.

In [129]:
file_handle.writelines(['spam', 'egg'])
file_handle.close()
open("./data/example.txt", "r").readlines()

['someinformationspamegg']

A few more methods for working with files

In [130]:
file_handle = open("./data/example.txt", "r+")
file_handle.fileno() # file descriptor

44

In [123]:
file_handle.tell() # file object’s current position in bytes

0

In [126]:
file_handle.seek(8)
file_handle.tell()

8

In [156]:
file_handle.write("something unimportant")
file_handle.flush() # Flush the write buffers of the stream
file_handle.close()

### Remember to close file! Always!

And to do it in convenient way:

In [157]:
with open("./data/example.txt", "r+") as ouf:
    ...
    # do your stuff here and don't worry about closing the file
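A self-contained sketch of the same pattern. It writes to a temporary directory so that it does not depend on `./data/example.txt`; the file name and contents are made up:

```python
import os
import tempfile

# A temporary directory keeps the sketch from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")

    # The file is closed automatically when the with-block ends.
    with open(path, "w", encoding="utf-8") as out_file:
        out_file.writelines(line + "\n" for line in ["spam", "egg"])

    with open(path, encoding="utf-8") as in_file:
        lines = in_file.readlines()

print(lines)           # ['spam\n', 'egg\n']
print(in_file.closed)  # True
```

The file is also closed if an exception is raised inside the block, which is the main reason to prefer `with` over calling `close()` by hand.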


In [45]:
from io import StringIO

file_like = StringIO('line1\nline2\nline3\nline4\nline5\n')

print(file_like.readlines())

['line1\n', 'line2\n', 'line3\n', 'line4\n', 'line5\n']


### What we have learned
* how to open file
* how to work with file
* what is file-like object

![image.png](attachment:image.png)

## Functions

In [16]:
def funny_function():
    return 'to_the_blue_lagoon'

A function definition is an executable statement. Its execution binds the function name in the current local namespace to a function object (a wrapper around the executable code for the function). This function object contains a reference to the current global namespace as the global namespace to be used when the function is called.

In [17]:
funny_function()

'to_the_blue_lagoon'

In [18]:
funny_function

<function __main__.funny_function()>

In [20]:
dir(funny_function)

['__annotations__',
 '__call__',
 '__class__',
 '__closure__',
 '__code__',
 '__defaults__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__get__',
 '__getattribute__',
 '__globals__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__kwdefaults__',
 '__le__',
 '__lt__',
 '__module__',
 '__name__',
 '__ne__',
 '__new__',
 '__qualname__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

### Naming constraints
- latin letters
- underscore `_`
- digits 0-9, __but not at the beginning!__

In [21]:
def 1foo():
    pass

SyntaxError: invalid syntax (<ipython-input-21-735460777140>, line 1)

__return__ can be omitted. By default a function returns None

In [140]:
def foo():
    pass

print(foo())

None


In [16]:
print(print(foo()))

None
None


A function can have several __return__ statements

In [26]:
def never_gonna(what):
    if what == 1:
        return 'give you up'
    if what == 2:
        return 'let you down'
    return 'run around and desert you'
    return "You wouldn't get this from any other guy"
    
print(never_gonna(1))
print(never_gonna(10))

give you up
run around and desert you


For documentation, use a docstring: a string literal (usually triple-quoted) placed first in the function body

In [29]:
def creep():
    """I wish I was special"""
    return "sorry"

How to reveal documentation?

In [30]:
creep.__doc__


'I wish I was special'

In [31]:
help(creep)

Help on function creep in module __main__:

creep()
    I wish I was special



In [9]:
?help

In [33]:
from inspect import getdoc

getdoc(creep)

'I wish I was special'

### Arguments

#### Positional arguments

In [28]:
def avg(a, b):
    return (a+b)/2

avg(10, 9)

9.5

#### Keyword arguments

In [46]:
def order_an_ice_cream(scoop, toping="syrup", flavor="chocolate"):
    return "{} scoop(s) with {} and {} toping".format(
        scoop, flavor, toping
    )


print(order_an_ice_cream(10))
print(order_an_ice_cream(3, "nut", "strawberries and bananas"))
print(order_an_ice_cream(scoop=1, toping="KETCHUP??!?!?", flavor="vanilla"))

10 scoop(s) with chocolate and syrup toping
3 scoop(s) with strawberries and bananas and nut toping
1 scoop(s) with vanilla and KETCHUP??!?!? toping


In [24]:
order_an_ice_cream(3, toping="nut", "strawberries and bananas")

SyntaxError: positional argument follows keyword argument (<ipython-input-24-1eaced7ef92e>, line 1)

### Default arguments initialization

In [49]:
def foo(a, lst=[]):
    lst.append(a)
    return lst

print(foo(1))
print(foo(2))
print(foo(3))
print(foo(1, ['q']))
print(foo(1))

[1]
[1, 2]
[1, 2, 3]
['q', 1]
[1, 2, 3, 1]


Question: when and how many times are default arguments initialized?

Default parameter values are evaluated from left to right when the function definition is executed
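The stored defaults can be inspected through the `__defaults__` attribute, which makes the single shared list visible. A minimal sketch; the function name `append_to` is chosen here to avoid clashing with `foo`:

```python
def append_to(a, lst=[]):
    lst.append(a)
    return lst


# The default list is created once, when `def` is executed.
print(append_to.__defaults__)  # ([],)

result_a = append_to(1)
result_b = append_to(2)

# Both calls returned the very same list object that is stored on the function.
print(result_a is result_b)                 # True
print(append_to.__defaults__)               # ([1, 2],)
print(append_to.__defaults__[0] is result_a)  # True
```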

In [47]:
def foo(a, lst=None):
    #  lst = lst or []
    if lst is None:
        lst = []
        
    lst.append(a)
    return lst


print(foo(1))
print(foo(2))
print(foo(3))
print(foo(1, ['q']))
print(foo(2))

[1]
[2]
[3]
['q', 1]
[2]


### Packing

In [13]:
def avg(*args):
    return sum(args)/len(args)

In [14]:
a = [i for i in range(10)]
avg(*a)

4.5

In [39]:
avg()

ZeroDivisionError: division by zero

In [17]:
def foo_with_args(first, *args):
    print(type(args))
    arguments = (first,) + args
    print(arguments)
    return '; '.join(map(str, arguments))

print(foo_with_args(1, 2, 4))
foo_with_args(1)

<class 'tuple'>
(1, 2, 4)
1; 2; 4
<class 'tuple'>
(1,)


'1'

In [19]:
def foo_with_args_and_kwargs(first, *args, **kwargs):
    return [first, *args, *((k, v) for k,v in kwargs.items())]


print(*foo_with_args_and_kwargs(1, 10, 100))
print(*foo_with_args_and_kwargs(1, 10, 100, **{'param': True}))

some_dict = {'param': True, 'not_param': False}
print(*foo_with_args_and_kwargs(1, 10, 100, **some_dict))

1 10 100
1 10 100 ('param', True)
1 10 100 ('param', True) ('not_param', False)


### keyword only arguments


To mark parameters as keyword-only, indicating the parameters must be passed by keyword argument, place an * in the arguments list just before the first keyword-only parameter.

In [167]:
def obey_me(param, *, password=None):
    return param, f'password={password}'

print(obey_me(1, password='yeeea'))
obey_me(1, 'noooo')

(1, 'password=yeeea')


TypeError: obey_me() takes 1 positional argument but 2 were given

One more function attribute

In [169]:
obey_me.__kwdefaults__

{'password': None}

Since python 3.8:

Looking at this in a bit more detail, it is possible to mark certain parameters as positional-only. If positional-only, the parameters’ order matters, and the parameters cannot be passed by keyword. Positional-only parameters are placed before a `/` (forward-slash). The `/` is used to logically separate the positional-only parameters from the rest of the parameters. If there is no / in the function definition, there are no positional-only parameters.

Parameters following the `/` may be positional-or-keyword or keyword-only.



In [None]:
def f(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2):
      -----------    ----------     ----------
        |             |                  |
        |        Positional or keyword   |
        |                                - Keyword only
         -- Positional only

In [174]:
import operator


OPERATORS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.floordiv,
}


def calc_for_two_numbers(x, y, /, operator='+'):
    op = OPERATORS[operator]
    return op(x, y)


calc_for_two_numbers(2, 2, '*')
# calc_for_two_numbers(x=2, y=2, operator='*')  # invalid, would raise a TypeError

4

### Callable

Calling a function means calling its `__call__` method. `add(1, 2)` == `add.__call__(1, 2)`

* user-defined functions
* built-in functions 
* methods of built-in objects
* class objects
* methods of class instances
* all objects having a `__call__` method are callable

By defining a function with ```def funcname(parameters):``` you actually create a new object with a defined `__call__` method.
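The same mechanism lets instances of a class behave like functions. A sketch with a hypothetical `Multiplier` class and an `add` function written for this example:

```python
class Multiplier:
    """Instances are callable because the class defines __call__."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


def add(a, b):
    return a + b


double = Multiplier(2)

print(double(21))                       # 42
print(callable(double))                 # True
print(add(1, 2) == add.__call__(1, 2))  # True
```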

### Check whether an object is `callable`

In [7]:
print(callable(len), callable(45), callable(callable))

True False True


### Function attributes

* `__doc__`

* The function’s documentation string, or None if unavailable; not inherited by subclasses.

* Writable

In [97]:
from inspect import cleandoc


def foo():
    """
    Basically this function does nothing, even has no body except 
    this docstring. But this is totally valid function.
    
    :returns: None
    """


foo()
print(foo.__doc__)
foo.__doc__ = cleandoc(foo.__doc__)
print(foo.__doc__)


    Basically this function does nothing, even has no body except 
    this docstring. But this is totally valid function.
    
    :returns: None
    
Basically this function does nothing, even has no body except 
this docstring. But this is totally valid function.

:returns: None



* `__name__`

* The function’s name.

* Writable

In [132]:
import uuid


def foo():
    ...

    
print(foo.__name__)
foo.__name__ = uuid.uuid4().hex
print(foo.__name__)
foo.__name__ = object()


foo
23217db8774c4ba5906f1b096c10e5e4


TypeError: __name__ must be set to a string object

* `__qualname__`

* The function’s qualified name.

* Writable

A dotted name showing the “path” from a module’s global scope to a class, function or method defined in that module, as defined in PEP 3155. For top-level functions and classes, the qualified name is the same as the object’s name:

In [104]:
def foo():
    def bar():
        return
    return bar


foo().__qualname__

'foo.<locals>.bar'

* `__module__`

* The name of the module the function was defined in, or None if unavailable.

* Writable

In [105]:
def foo():
    ...

    
print(foo.__module__)

__main__


In [106]:
from functools import partial


partial.__module__

'functools'

* `__code__`

* The code object representing the compiled function body.

* Writable

In [20]:
def foo():
    return 'return value'


eval(foo.__code__)
foo.__code__

<code object foo at 0x7fa18eeb4930, file "<ipython-input-20-f222ffa82bbc>", line 1>
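Because `__code__` is writable, the compiled body of a function can be replaced at runtime. The sketch below assumes two hypothetical functions, `greet` and `farewell`, that take no arguments and use no closure variables (the replacement code object must have the same number of free variables).

```python
def greet():
    return 'hello'


def farewell():
    return 'goodbye'


before = greet()
# Swap the compiled body; the function object and its __name__ stay the same.
greet.__code__ = farewell.__code__
after = greet()

print(before, after, greet.__name__)
```

This is rarely a good idea in production code, but it shows that the function object and its code object are two separate things.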

* `__globals__`

* A reference to the dictionary that holds the function’s global variables — the global namespace of the module in which the function was defined.

* Read-only
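A minimal sketch of both properties: `__globals__` is the same dictionary that `globals()` returns in the defining module, and the attribute itself cannot be rebound. The names `same_dict` and `rebind_error` are introduced only for this example.

```python
def foo():
    pass


# The attribute is the very same dict that globals() returns here.
same_dict = foo.__globals__ is globals()

try:
    foo.__globals__ = {}          # the attribute is read-only
except AttributeError as error:
    rebind_error = error
else:
    rebind_error = None

print(same_dict)
print(type(rebind_error).__name__)
```

Read-only applies to the attribute, not to the dictionary: its contents can still change as global names are assigned.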

![image.png](attachment:image.png)

### SCOPES

Namespaces associate names (variables) with objects, so that each name in a namespace refers to exactly one object.

Namespace ~= dict
<img src='./images/namespaces.png' style='float: right;width:70%'>

<img src='./images/python_namespace.png' style='float:right;width:70%;height:70%'>

In [2]:
spam = 'spam and eggs'
eggs = spam
 
print(spam)  # spam and eggs
print(eggs)  # spam and eggs

print(id(spam)) 
print(id(eggs))

spam and eggs
spam and eggs
140714621322480
140714621322480
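The "Namespace ~= dict" idea can be sketched with a module object, whose namespace is an ordinary dictionary. The `pantry` module is a hypothetical, empty module created on the fly just for this example.

```python
import types

pantry = types.ModuleType('pantry')   # hypothetical empty module

# Attribute assignment stores a key in the module's namespace dict.
pantry.spam = 'spam and eggs'
print(vars(pantry)['spam'])

# Writing to the dict directly creates a new name in the namespace.
vars(pantry)['ham'] = 'ham and eggs'
print(pantry.ham)
```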


### Encapsulation and scoping

Enclosing: the ability of a function to use variables that do not belong to its own local scope, for example variables defined in an outer function.

In [19]:
def spam():
    eggs = 'spam and eggs'    
    def cantine():
        print(eggs)
    cantine()
    
spam()


spam and eggs


In [21]:
def spam():
    print(eggs)
 
eggs = 'spam and eggs'
spam()  # spam and eggs

spam and eggs


In [1]:
def spam():
    eggs = 'spam and eggs'
    print(eggs)
 
spam()       # spam and eggs
print(eggs)  # raises a NameError exception

spam and eggs


NameError: name 'eggs' is not defined

### Initializing a new object creates a new namespace

In [21]:
del eggs

In [149]:
class Meal:
    eggs = 2
  
my_meal = Meal()
print(my_meal.eggs)    # 2
print(eggs)            # raises a NameError exception

2


NameError: name 'eggs' is not defined

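###### To see what is in an object's namespace, use `__dict__` or `vars()`

In [56]:
# `eggs` is a class attribute, so it lives in the namespace of Meal, not of the instance
{name: value for name, value in vars(Meal).items() if not name.startswith('__')}

{'eggs': 2}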

### LEGB

#### Name lookup goes through at most 4 scopes: local -> enclosing -> global -> built-in

<img src='./images/python_namespaces_legb.jpg' style='float: right'>


<div style='float:left;width:40%;font-size:25px'>
<b>Local</b> – Names which are assigned within the current function.

<b>Enclosing</b> – Names which are assigned in an enclosing function and used by a function nested inside it (a closure)

<b>Global</b> – Names which are assigned at the top-level of a module, for example on the top-level of your Python file

<b>Built-in</b> – Names which are standard Python built-ins, such as open, print, len, Exception. Keywords such as import and return are part of the syntax, not names in this scope.</div>

<img src='./images/python_namespaces_code.jpg' style='float:right;width:60%' >


In [23]:
global_var = 0

def func():
    var = 'variable'
    
    def print_vars():
        inner_var = 1 
        print('inner_var', inner_var) # local
        print('var', var) # enclosing
        print('global_var', global_var) # global
        print('func', func) # global: func is defined at module level
    print_vars()

func()

inner_var 1
var variable
global_var 0
func <function func at 0x7fa18f705c80>
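The cell above exercises the local, enclosing and global scopes. A minimal sketch of the fourth one, the built-in scope, is shown below: a local definition hides the built-in `len` only inside the function. `shadow_demo` is a hypothetical name used for illustration.

```python
import builtins


def shadow_demo():
    def len(sequence):            # local name hides the built-in inside this function
        return 'shadowed'
    return len([1, 2, 3])


print(shadow_demo())              # the local definition wins
print(len([1, 2, 3]))             # outside, lookup falls through to the built-in scope
print(builtins.len is len)        # the built-in scope is the `builtins` module
```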


In [30]:
from dis import dis
dis(func) 

  4           0 LOAD_CONST               1 ('variable')
              2 STORE_DEREF              0 (var)

  6           4 LOAD_CLOSURE             0 (var)
              6 BUILD_TUPLE              1
              8 LOAD_CONST               2 (<code object print_vars at 0x7fe4a4352420, file "<ipython-input-29-9aac0c951ca1>", line 6>)
             10 LOAD_CONST               3 ('func.<locals>.print_vars')
             12 MAKE_FUNCTION            8
             14 STORE_FAST               0 (print_vars)

 12          16 LOAD_FAST                0 (print_vars)
             18 CALL_FUNCTION            0
             20 POP_TOP
             22 LOAD_CONST               0 (None)
             24 RETURN_VALUE

Disassembly of <code object print_vars at 0x7fe4a4352420, file "<ipython-input-29-9aac0c951ca1>", line 6>:
  7           0 LOAD_CONST               1 (1)
              2 STORE_FAST               0 (inner_var)

  8           4 LOAD_GLOBAL              0 (print)
              6 LOAD_CONST    

#### LEXING / TOKENIZING.

In [157]:
b = 6
def f1(a):
    print(a)
    print(b)

In [158]:
from dis import dis
dis(f1)

  3           0 LOAD_GLOBAL              0 (print)
              2 LOAD_FAST                0 (a)
              4 CALL_FUNCTION            1
              6 POP_TOP

  4           8 LOAD_GLOBAL              0 (print)
             10 LOAD_GLOBAL              1 (b)
             12 CALL_FUNCTION            1
             14 POP_TOP
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE


* Load the global name `print`.
* Load the local name `a`.
* Call the `print` function with 1 positional argument.
* Load the global name `b`.
* Load a constant, in this case `None`.
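The distinction between `LOAD_FAST` and `LOAD_GLOBAL` is decided when the function is compiled, and it is also visible on the code object without reading bytecode. A small sketch, reusing `f1` from the cell above:

```python
b = 6


def f1(a):
    print(a)
    print(b)


# Names the compiler treats as local variables (parameters included).
print(f1.__code__.co_varnames)

# Names that are looked up at run time in the global and built-in scopes.
print(f1.__code__.co_names)
```

Here `a` appears among the local names, while `print` is resolved by name at call time.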

#### To see what is in scope

In [None]:
global_var = 0

def func():
    var = 'variable'
    
    def print_vars(arg):
        inner_var = 1 
        print(locals()) # {'arg': 'argument', 'inner_var': 1}
        print(globals()) # {'__name__': '__main__', '__doc__' ..., 'global_var' : 0}
        
    print_vars('argument')

func()

Functions in Python can access variables that are not in the current scope. 

Important to remember: the lookup happens when the function is called, not when it is defined.


In [58]:
def f():
    print(i)

for i in range(5):
    f()

0
1
2
3
4
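A common consequence of lookup at call time is late binding in closures created inside a loop. The sketch below uses two hypothetical helper functions, `make_callbacks` and `make_bound_callbacks`; the second one freezes the current value through a default argument, which is evaluated at definition time.

```python
def make_callbacks():
    callbacks = []
    for i in range(3):
        def callback():
            return i              # `i` is looked up when callback() runs
        callbacks.append(callback)
    return callbacks


def make_bound_callbacks():
    callbacks = []
    for i in range(3):
        def callback(i=i):        # default value is evaluated at definition time
            return i
        callbacks.append(callback)
    return callbacks


late = [cb() for cb in make_callbacks()]
early = [cb() for cb in make_bound_callbacks()]

print(late)    # [2, 2, 2]
print(early)   # [0, 1, 2]
```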


In [160]:
global_var = 0

def func():
    global_var = 1
    
print(global_var)
func()
print(global_var)

0
0


LEGB does not apply to assignment: a name assigned anywhere inside a function is local throughout that function.

In [39]:
global_var = 0

def foo():
    global_var = global_var + 1

print(global_var)
foo()

0


UnboundLocalError: local variable 'global_var' referenced before assignment
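Only rebinding a name makes it local. Mutating an object through a name found in an outer scope does not involve assignment to that name, so the usual lookup applies. A minimal sketch with hypothetical names `counter`, `log`, `rebind` and `mutate`:

```python
counter = 0
log = []


def rebind():
    counter = 1            # creates a new local name; the global is untouched


def mutate():
    log.append('called')   # no assignment to `log`, so the global list is found


rebind()
mutate()

print(counter, log)
print('counter' in rebind.__code__.co_varnames)   # the compiler marked it as local
```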

To change the default behavior, use __nonlocal__ and __global__


### global

To be able to assign a new value to a global (module-level) variable from inside a function,
use the __global__ statement.


In [44]:
global_var = 0

def foo():
    global global_var
    global_var = global_var + 1

print(global_var)
foo()
print(global_var)

0
1


### nonlocal

Nonlocal namespace – the namespace of the outer function when a function is defined inside another function.

To be able to assign a new value to a variable from the enclosing scope,
use the __nonlocal__ statement.


In [47]:
def f1():
    a = 1
    b = 2
    
    def inner(): 
        nonlocal a
        a = a + b
        
    inner()
    print('local a is', a)
    
f1()

local a is 3
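A typical use of `nonlocal` is a closure that keeps state between calls. The sketch below is only an illustration; `make_counter` and `increment` are hypothetical names.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count      # rebind the variable of make_counter, not a new local
        count += 1
        return count

    return increment


counter_a = make_counter()
counter_b = make_counter()

# Each call to make_counter() creates its own enclosing namespace.
print(counter_a(), counter_a(), counter_b())   # 1 2 1
```

Without the `nonlocal` line, `count += 1` would raise `UnboundLocalError`, for the same reason as in the global example earlier.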


### What to remember:

1. Python has 4 scopes: local, enclosing, global, built-in

2. Name lookup goes from local to built-in. A name that is assigned inside a function is considered local to it.

3. This behavior can be changed with the global and nonlocal statements


#### Function annotation 

In [71]:
from typing import Union

def is_palindrome(
    s: Union[str, int], 
    variant: int,
    key: None = None
) -> bool:
    s = str(s)  # accept ints too, as the annotation promises
    if variant == 1:
        return s == ''.join(reversed(s)) 
    if variant == 2:
        return s == s[::-1]
    raise ValueError('choose variant 1 or 2')

print(is_palindrome('madam', 1))
print(is_palindrome('madam', 2))

True
True


In [175]:
# easter egg

is_palindrome.__annotations__

{'s': typing.Union[str, int], 'variant': int, 'key': None, 'return': bool}

In [73]:
a: None = None
s: str = '123'
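Annotations are metadata only: Python stores them but does not check them when the function is called. A minimal sketch, with a hypothetical function `double`:

```python
from typing import get_type_hints


def double(value: int) -> int:
    return value * 2


# No type check happens at call time, so a str is accepted as well.
print(double('ab'))

# The annotations are available for tools such as type checkers and IDEs.
print(double.__annotations__)
print(get_type_hints(double))
```

Enforcing the types requires an external tool (a static type checker) or explicit checks in the function body.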

#### Naming and a few pieces of advice:

* A function should do only one thing (logically)
* If it takes time to scroll through a function, something is wrong
* A function's name should describe what it does as briefly as possible
* A long, descriptive name is better than a short, unclear one
* Prefer a verb when naming a function

![image.png](attachment:image.png)

A nice meme that I could not find an appropriate place for. 
![image.png](attachment:image.png)