## Function Arguments and mutability

> video 21

Immutable objects (strings, tuples, numbers) are safe from unintended side-effects.

```python
def process(s):
    s = s + ' world'
    return s

my_var = 'hello' # my_var points to a string object in memory, e.g. in hex 0x1000
process(my_var) # s points to the same string object in memory, e.g. in hex 0x1000
# -> 'hello world': the function returns a new string object, e.g. in hex 0x2000
print(my_var) # my_var still points to the same string object in memory, e.g. in hex 0x1000
```

Mutable objects (lists, dictionaries) are not safe from unintended side-effects.

```python
def process(lst):
    lst.append(100)

my_list = [1, 2, 3] # my_list points to a list object in memory, e.g. in hex 0x1000
process(my_list) # lst points to the same list object in memory, e.g. in hex 0x1000
# the function modifies the list object in memory, e.g. in hex 0x1000
print(my_list) # my_list still points to the same list object in memory, e.g. in hex 0x1000
# -> [1, 2, 3, 100]: my_list has been modified
```

Immutable collection objects that contain mutable objects are not safe from unintended side-effects.

```python
def process(t):
    t[0].append(3)

my_tuple = ([1, 2], 'a') # my_tuple points to a tuple object in memory, e.g. in hex 0x1000
process(my_tuple) # t points to the same tuple object in memory, e.g. in hex 0x1000
print(my_tuple) # my_tuple still points to the same tuple object in memory, e.g. in hex 0x1000 but the list was modified
# -> ([1, 2, 3], 'a')
```
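
The tuple case can be made more concrete. The following is a minimal sketch (the variable names `inner_id_before`, `inner_id_after` and `rebinding_failed` are only for illustration): the tuple never changes which objects it refers to, but the list it refers to can still be mutated in place.

```python
my_tuple = ([1, 2], 'a')
inner_id_before = id(my_tuple[0])

# Allowed: this mutates the list object that the tuple refers to.
my_tuple[0].append(3)
inner_id_after = id(my_tuple[0])

# Not allowed: a tuple slot cannot be rebound to another object.
try:
    my_tuple[0] = [9, 9]
    rebinding_failed = False
except TypeError:
    rebinding_failed = True

print(my_tuple)                            # ([1, 2, 3], 'a')
print(inner_id_before == inner_id_after)   # True: same list object
print(rebinding_failed)                    # True
```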

In [1]:
def process(s):
    print(f'Initial s # = {id(s)}')
    s = s + ' world'
    print(f'Final s # = {id(s)}')
    

In [2]:
my_var = 'hello'
print(f'my_var # = {id(my_var)}')

my_var # = 140679843141552


In [3]:
process(my_var)

Initial s # = 140679843141552
Final s # = 140679412791664


In [4]:
id(my_var)

140679843141552

In [5]:
my_var

'hello'

In [6]:
def mod_list(lst):
    print(f'Initial s # = {id(lst)}')
    lst.append(100)
    print(f'Final s # = {id(lst)}')

In [7]:
my_list = [1, 2, 3]

In [8]:
id(my_list)

140679400966784

In [9]:
mod_list(my_list)

Initial s # = 140679400966784
Final s # = 140679400966784


In [10]:
id(my_list)

140679400966784

In [11]:
my_list

[1, 2, 3, 100]

In [12]:
def mod_tuples(t):
    print(f'Initial t # = {id(t)}')
    t[0].append(100)
    print(f'Final t # = {id(t)}')

In [13]:
my_tuple = ([1, 2], 'a')

In [14]:
id(my_tuple)

140679402246144

In [15]:
mod_tuples(my_tuple)

Initial t # = 140679402246144
Final t # = 140679402246144


In [16]:
my_tuple

([1, 2, 100], 'a')

## Shared References and mutability

> video 22

A shared reference exists when multiple variables point to the same object in memory (they hold the same memory address).

```python
my_list = [1, 2, 3]
my_list2 = my_list

def process(lst):
    lst.append(100)

process(my_list2)

# my_list, my_list2 and lst point to the same list object in memory, e.g. in hex 0x1000
```

Python may reuse the same object in memory if it is immutable:
```python
a = 10
b = 10 # both a and b point to the same integer object in memory, e.g. in hex 0x1000

s1 = 'hello'
s2 = 'hello' # both s1 and s2 point to the same string object in memory, e.g. in hex 0x2000
```
Is it safe? Yes, because strings and numbers are immutable.

Mutable objects: Python never creates a shared reference on its own when the object is mutable, but an explicit assignment such as `b = a` still shares the object:
```python
a = [1, 2, 3]
b = a
b.append(100) # both a and b point to the same list object in memory, e.g. in hex 0x1000

print(a)
# -> [1, 2, 3, 100]
```
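
When a shared reference to a list is not wanted, one option is to copy it. This is a minimal sketch, assuming a shallow copy is enough; any nested mutable elements would still be shared between the two lists.

```python
a = [1, 2, 3]
b = a.copy()      # shallow copy: b is a new list object with the same elements
b.append(100)     # only b is modified

print(a)          # [1, 2, 3]
print(b)          # [1, 2, 3, 100]
print(a is b)     # False
```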

In [17]:
a = 'hello'
b = a

In [18]:
id(a)

140679843141552

In [19]:
id(b)

140679843141552

In [20]:
a = 'hello'

In [21]:
b = 'hello'

In [22]:
hex(id(a))

'0x7ff29413d7b0'

In [23]:
hex(id(b))


'0x7ff29413d7b0'

In [24]:
b = 'hello world'

In [25]:
hex(id(b))


'0x7ff27a525cf0'

In [26]:
hex(id(a))


'0x7ff29413d7b0'

In [27]:
a

'hello'

In [28]:
b

'hello world'

In [29]:
# mutable example
l = [1, 2, 3]

In [31]:
t = l

In [32]:
t.append(100)

In [33]:
hex(id(l))

'0x7ff27a6d1440'

In [34]:
hex(id(t))

'0x7ff27a6d1440'

In [35]:
t

[1, 2, 3, 100]

In [36]:
l

[1, 2, 3, 100]

In [37]:
a = 10

In [38]:
b = 10

In [39]:
hex(id(a))

'0x7ff296900210'

In [40]:
hex(id(b))


'0x7ff296900210'

In [41]:
a = 500

In [42]:
b = 500

In [43]:
hex(id(a))


'0x7ff2798c7cd0'

In [44]:
hex(id(b))

'0x7ff2798c7fb0'

Python does not share references every time: the two `10` objects above have the same address, but the two `500` objects do not.

## Variable equality

>video 23

1. Variable address equality: `a is b` (same memory address)
2. Object State equality: `a == b` (same data)

### Negation

1. Variable address inequality: `a is not b` or `not(a is b)` (different memory address)
2. Object State inequality: `a != b` (different data)
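
A short sketch of the two negated operators, using lists so that the result does not depend on any caching done by the interpreter:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal data, but a separate list object
c = a           # shared reference to the same list object

print(a is not b)   # True: different objects
print(a != b)       # False: same data
print(a is not c)   # False: same object
```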


In [1]:
a = 10
b = a

In [2]:
a is b

True

In [3]:
a == b

True

In [4]:
a = 'hello'
b = 'hello'

In [6]:
a is b # true but not always

True

In [7]:
a == b

True

In [8]:
# mutable example
a = [1, 2, 3]
b = [1, 2, 3]

In [9]:
a is b

False

In [10]:
a == b

True

In [11]:
a = 10
b = 10.0

In [12]:
a is b

False

In [13]:
a == b

True

## The None object

The `None` object is a real object. The memory manager will always use a shared reference when assigning a variable to `None`.

```python
a = None
b = None
a is b # True
a is None # True
```
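
Because every `None` is the same object, `is None` is the usual way to test for it. A minimal sketch, using a hypothetical helper `append_item` that treats `None` as "no list supplied":

```python
def append_item(item, target=None):
    """Append item to target, creating a new list when none is supplied."""
    if target is None:      # identity test against the single None object
        target = []
    target.append(item)
    return target

print(append_item(1))           # [1]
print(append_item(2))           # [2]  (a fresh list each time)

existing = [0]
print(append_item(1, existing)) # [0, 1]
print(existing)                 # [0, 1]  (the caller's list was modified)
```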

In [14]:
a = 10
b = 10

In [15]:
id(a)


139758112408080

In [17]:
id(b)

139758112408080

In [18]:
print('a is b', a is b)

a is b True


In [19]:
print('a == b', a == b)

a == b True


In [20]:
a = 500
b = 500

In [21]:
print('a is b', a is b)
print('a == b', a == b)

a is b False
a == b True


## Everything is an object

>video 24

Everything is an object: every value is an instance of some class. That includes:

- Functions (`function`)
- Classes (themselves objects, instances of `type`)
- Types (`type`)

This means they all have a memory address.

As a consequence, any object can be:

- assigned to a variable (including functions)
- passed to a function (including functions)
- returned from a function (including functions) 

OBS:

```python
def my_func():
    ...
```

`my_func` is the *name* of the function.
`my_func()` *calls* (or *invokes*) the function.
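
The difference between the name and the call can be sketched as follows. The return value `42` and the names `ref` and `result` are made up for the illustration.

```python
def my_func():
    return 42

ref = my_func         # no parentheses: ref now refers to the function object
result = my_func()    # parentheses: the function runs and returns a value

print(ref is my_func)   # True
print(callable(ref))    # True
print(result)           # 42
print(ref())            # 42
```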


In [1]:
a = 10
print(type(a))

<class 'int'>


In [2]:
b = int(10)
print(type(b))

<class 'int'>


In [3]:
help(int)#documentation

Help on class int in module builtins:

class int(object)
 |  int([x]) -> integer
 |  int(x, base=10) -> integer
 |  
 |  Convert a number or string to an integer, or return 0 if no arguments
 |  are given.  If x is a number, return x.__int__().  For floating point
 |  numbers, this truncates towards zero.
 |  
 |  If x is not a number or if base is given, then x must be a string,
 |  bytes, or bytearray instance representing an integer literal in the
 |  given base.  The literal can be preceded by '+' or '-' and be surrounded
 |  by whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.
 |  Base 0 means to interpret the base from the string as an integer literal.
 |  >>> int('0b100', base=0)
 |  4
 |  
 |  Built-in subclasses:
 |      bool
 |  
 |  Methods defined here:
 |  
 |  __abs__(self, /)
 |      abs(self)
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __bool__(self, /)
 |      True if 

In [4]:
c = int()

In [5]:
c

0

In [6]:
c = int('101', base=2)

In [7]:
c

5

In [8]:
def square(a):
    return a ** 2

In [9]:
type(square)

function

In [10]:
f = square

In [11]:
id(f)

139855462919056

In [12]:
id(square)

139855462919056

In [13]:
f is square

True

In [14]:
f(2)

4

In [15]:
square(2)

4

In [16]:
def cube(a):
    return a ** 3

In [17]:
def sel_func(fn_id):
    if fn_id == 1:
        return square
    else:
        return cube

In [18]:
f = sel_func(1)

In [19]:
f is square

True

In [20]:
f is cube

False

In [21]:
f(2)

4

In [22]:
f = sel_func(2)


In [23]:
f is cube

True

In [24]:
f(3)

27

In [25]:
sel_func(2)(3)  # sel_func(2) is evaluated first and returns cube, which is then called with 3

27

In [26]:
def exec_func(fn, n):
    return fn(n)

In [27]:
exec_func(cube, 3)

27

In [28]:
exec_func(square, 3)

9

## Python optimizations - interning

>video 25

**Applies to CPython, version 3.6.**

### Interning

Reusing objects on demand.

At startup, Python pre-loads (caches) a global list of integers in the range [-5, 256] (both ends included).

Any time an integer in that range is referenced, Python uses the cached version of that object.

These integers are **singletons**. It is an optimization strategy: small integers show up often.
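
The cache can be probed with a small sketch. The helper `is_cached` is hypothetical; it builds the integers from strings at run time so that the compiler cannot merge equal literals. The results are a CPython implementation detail and should not be relied on in real code.

```python
def is_cached(n):
    """Return True if two ints equal to n, created separately, are one object."""
    a = int(str(n))
    b = int(str(n))
    return a is b

for n in (-5, 0, 256):
    print(n, is_cached(n))    # True on CPython: these values come from the cache

# Outside the cached range the result depends on the interpreter version.
print(257, is_cached(257))
```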


In [29]:
a = 10

In [30]:
b = 10

In [31]:
id(a)

139855529755152

In [32]:
id(b)

139855529755152

In [33]:
a is b

True

In [34]:
a = 257
b = 257

In [35]:
a is b

False

## Python optimizations - string interning

>video 26

Some strings are also automatically **interned** - but not all!

As the Python code is compiled, **identifiers** are interned:

- variable names
- function names
- class names
- etc

Some string literals may also be automatically interned. **But don't count on it!**


Why? Optimization strategy: speed and memory.

```python
a = 'some_long_string'
b = 'some_long_string'
```

Using `a == b`, we need to compare the two strings character by character.

But if we know that `'some_long_string'` has been interned, then `a` and `b` are the same string if they both point to the same memory address.

In that case we can use `a is b`, which is much faster than `a == b`.

### Force string interning

```python
import sys

a = sys.intern('some_long_string')
b = sys.intern('some_long_string')
a is b # True
```

When should you do this?

- dealing with a large number of strings that could have high repetition, e.g. tokenizing a large corpus of text (NLP)
- lots of string comparisons

In general, though, don't do it.
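
For the tokenizing case, a minimal sketch might look like this. The helper `intern_tokens` and the sample sentence are made up for the illustration; only `sys.intern` comes from the standard library.

```python
import sys

def intern_tokens(text):
    """Split text on whitespace and intern every token."""
    return [sys.intern(token) for token in text.split()]

tokens = intern_tokens('the cat saw the other cat')

# Repeated tokens now refer to one shared string object,
# so they can be compared by identity.
print(tokens[0] is tokens[3])   # True: both are 'the'
print(tokens[1] is tokens[5])   # True: both are 'cat'
```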

In [36]:
w1 = 'hello'

In [37]:
w2 = 'hello'

In [38]:
print(id(w1), id(w2))

139855468608368 139855468608368


In [40]:
w1 = 'hello world'

In [41]:
w2 = 'hello world'

In [42]:
print(id(w1), id(w2))

139855461706608 139855461707568


In [43]:
w1 is w2

False

In [44]:
w1 == w2

True

In [45]:
w1 = '_this_is_a_long_string_that_could_be_used_as_an_identifier'

In [46]:
w2 = '_this_is_a_long_string_that_could_be_used_as_an_identifier'


In [47]:
w1 is w2

True

In [48]:
# force interning
import sys

In [49]:
w1 = sys.intern('hello world')

In [50]:
w2 = sys.intern('hello world')

In [51]:
w1 is w2

True

In [52]:
w3 = 'hello world'

In [53]:
print(id(w1), id(w2), id(c))  # c is the int from an earlier cell; id(w3) was probably intended

139855461806512 139855461806512 139855529754992


In [54]:
w1 == w2

True

In [55]:
def compare_equals(n):
    a = ' a long string is not interned'*200
    b = ' a long string is not interned'*200
    for i in range(n):
        if a == b:
            pass


In [56]:
def compare_interning(n):
    a = sys.intern(' a long string is not interned'*200)
    b = sys.intern(' a long string is not interned'*200)
    for i in range(n):
        if a is b:
            pass

In [57]:
import time

In [59]:
start = time.perf_counter()
compare_equals(100000000)
end = time.perf_counter()
print('equality', end-start)

equality 31.628700511995703


In [60]:
start = time.perf_counter()
compare_interning(100000000)
end = time.perf_counter()
print('equality', end-start)

equality 6.043594757997198


## Python optimizations - Peephole

>video 27 

Constant expressions

- numeric calculations: `24 * 60` is pre-computed at compile time, and the expression is replaced with the result, `1440`.
- short sequences (length < 20): `(1, 2) * 5` is stored as `(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)`. The exact length limit depends on the CPython version.


Membership tests: constant mutable literals are replaced by their immutable counterparts

- `x in [1, 2, 3]` is replaced by `x in (1, 2, 3)`: the `[1, 2, 3]` list is constant, so it's replaced by its immutable counterpart, `(1, 2, 3)` tuple.
- lists => tuples
- sets => frozensets

Set membership is much faster than list or tuple membership, because sets are hash-based, much like dictionaries.

- `x in {1, 2, 3}` is replaced by `x in frozenset({1, 2, 3})`
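
The replacement can be observed in the constants stored on the compiled function. This is a sketch that assumes CPython; the function names `in_list` and `in_set` are made up, and the exact contents of `co_consts` vary between versions.

```python
def in_list(x):
    return x in [1, 2, 3]    # constant list literal used only for membership

def in_set(x):
    return x in {1, 2, 3}    # constant set literal used only for membership

# On CPython, the list literal is stored as a tuple constant
# and the set literal as a frozenset constant.
print(in_list.__code__.co_consts)
print(in_set.__code__.co_consts)
```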

In [61]:
# examples
def my_func():
    a = 24 * 60
    b = (1, 2) *5
    c = 'abc' * 3
    d = 'ab' * 11
    e = 'the quick brown fox' * 5
    f = ['a', 'b'] * 3
    

In [62]:
my_func.__code__.co_consts

(None,
 1440,
 (1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
 'abcabcabc',
 'ababababababababababab',
 'the quick brown foxthe quick brown foxthe quick brown foxthe quick brown foxthe quick brown fox',
 'a',
 'b',
 3)

In [63]:
def my_func(e):
    if e in [1, 2, 3]:
        pass

In [64]:
my_func.__code__.co_consts

(None, (1, 2, 3))

In [65]:
import string
import time

In [66]:
char_list = list(string.ascii_letters)

In [67]:
char_tuple = tuple(string.ascii_letters)

In [68]:
char_set = set(string.ascii_letters)


In [70]:
print(char_list)

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']


In [71]:
print(char_tuple)


('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z')


In [72]:
print(char_set)


{'B', 'a', 'N', 'Y', 'L', 'e', 'j', 'Z', 'b', 'd', 't', 'I', 'k', 'H', 'K', 'P', 'T', 'Q', 'w', 'x', 'W', 'n', 'U', 'p', 'E', 'i', 'h', 'V', 'l', 'r', 'v', 'y', 'A', 'G', 'J', 'm', 'z', 'D', 'R', 'q', 'S', 'o', 'F', 'X', 's', 'M', 'O', 'c', 'u', 'g', 'C', 'f'}


In [73]:
def member_test(n, container):
    for i in range(n):
        if 'z' in container:
            pass

In [75]:
start = time.perf_counter()
member_test(10000000, char_list)
end = time.perf_counter()
print('list', end - start)

list 10.462019730002794


In [76]:
start = time.perf_counter()
member_test(10000000, char_tuple)
end = time.perf_counter()
print('tuple', end - start)

tuple 7.0819098520005355


In [77]:
start = time.perf_counter()
member_test(10000000, char_set)
end = time.perf_counter()
print('set', end - start)

set 0.8268796039992594
