# Agenda

1. Data structures (this morning)
    - How are core data structures implemented
    - Advanced core data structures (`Decimal` and `namedtuple`)
    - Dictionaries and their variants
2. Functions (this afternoon + Tuesday morning)
    - Functions as nouns, not just verbs -- function objects, and how they work
    - Attributes on function objects
    - Bytecodes
    - How are arguments mapped to parameters when we call a function? (positional and keyword arguments)
    - Special parameter types (`*args` and `**kwargs`), keyword only, defaults
    - Variable scoping (LEGB)
    - The enclosing scope -- closures, inner functions, and why we want them
    - Type hints/annotations, and what they are (and aren't)
    - Dispatch tables
3. Functional programming in Python (Tuesday afternoon)
    - Comprehensions (list, set, dict comprehensions -- including nested comprehensions)
    - Functions as arguments to other functions
    - `lambda` and its friends
4. Modules and packages (Tuesday afternoon)
5. Objects (Wednesday)
    - Classes
    - Methods
    - Instances
    - Attributes -- one of the most important things in all of Python (ICPO rule for attribute lookup)
    - Magic methods
    - Properties
    - Descriptors
    - Methods vs. functions -- how are they different, and how does `self` work?
6. Iterators and generators (Thursday morning)
    - Making your class iterable
    - Generator functions
    - Generator comprehensions (aka generator expressions)
7. Decorators (Thursday)
8. Concurrency
    - Threads and processes (multiprocessing)
    - `asyncio`, and how it works (and doesn't)
    - Where is this all going in the Python world?

In [1]:
# gitautopush -- on PyPI


# Assignment in Python

In a language like C, a variable is a name for a location in memory. When I assign to a variable, the value is stored in that particular location. This is why (a) we need to declare our variables as having a certain type, and (b) only values of certain types can be assigned to a given variable.

In Python, variables refer to values. Variables are *not* memory locations! All values are objects, and (in CPython) all objects live on the heap. A variable is just a *reference* to one of those objects. This is why any Python variable can refer to any Python value, and why we don't (and cannot) declare our variables to be of a particular type.

This is what it means for Python to be a *dynamically typed* language. It doesn't mean that values don't have types. It just means that variables don't have types.
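As a minimal sketch of this idea (the name `value` is only for illustration), the same variable can be rebound to objects of different types, while each object keeps its own type:

```python
value = 5                  # value refers to an int object
first_type = type(value)

value = 'five'             # the same name now refers to a str object
second_type = type(value)

value = [5]                # ... and now to a list object
third_type = type(value)

# The variable never had a type of its own; only the objects did
print(first_type, second_type, third_type)
```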

In [2]:
x = 5    # when we assign, we're saying that the variable on the left should refer to the value on the right

In [3]:
type(x)

int

In [4]:
# there is no way in Python for one variable to refer to another variable

x = 5
y = x    # this doesn't mean that y "follows x around." Rather, it says that y refers to whatever x currently refers to

y

5

In [5]:
x = 10  # we've reassigned x

y

5

In [6]:
# if the value is mutable, then things get stickier

x = [10, 20, 30]
y = x

x[0] = '!'   # I have modified the list to which x refers... which is also the list to which y refers!
y

['!', 20, 30]
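For contrast, here is a small sketch of aliasing next to copying. The names `original`, `alias`, and `snapshot` are made up for this example, and `list.copy()` makes only a shallow copy:

```python
original = [10, 20, 30]
alias = original             # a second name for the same list object
snapshot = original.copy()   # a new list object with the same elements (shallow copy)

original[0] = '!'            # modify the list through one of its names

print(alias)       # ['!', 20, 30] -- the alias sees the change
print(snapshot)    # [10, 20, 30] -- the copy does not
```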

In [7]:
mylist = [10, 20, 30]
mylist.append(mylist)

mylist

[10, 20, 30, [...]]

In [8]:
del mylist   # delete the variable (del is a statement, so no parentheses are needed)

In [9]:
x = None

In [10]:
type(x)    

NoneType

`None` exists so that we can say that we have no value, and make that distinct from 0, `False`, the empty string, etc. It's its own value. In a boolean context (e.g., in an `if` statement), it is treated as false. But if you check whether `None` is equal to any of those other values, it isn't.

Where do we use `None`?

- A function that doesn't explicitly return a value returns `None`
- Function parameters often have `None` as their default value
- You'll see it for default attributes in objects, too
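The first two cases can be sketched as follows. The functions `greet` and `add_item` are hypothetical, written only for this illustration:

```python
def greet(name):
    print(f'Hello, {name}!')      # no return statement

result = greet('world')           # so the call evaluates to None


def add_item(item, items=None):
    # None signals "no list was passed", so we create a fresh list on each call
    if items is None:
        items = []
    items.append(item)
    return items

print(result)                 # None
print(add_item(1))            # [1]
print(add_item(2))            # [2] -- not [1, 2], because each call gets a new list
print(add_item(3, [1, 2]))    # [1, 2, 3]
```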

In [11]:
None == None

True

In [12]:
None == False

False

In [13]:
None == 0

False

In [15]:
# how can I check to see if something is None?

x = None

if x == None:   # unfortunately, this works -- but it isn't Pythonic
    print('Yes! It is None!')

Yes! It is None!


`None` is a singleton object; every `None` in Python is not only equal to every other one, but is the exact same object.

In [16]:
id(None)   # this returns the unique object number

4466115072

In [17]:
id(None)

4466115072

In [18]:
new_none = type(None)()

In [19]:
id(new_none)

4466115072

In [20]:
# According to PEP 8, the Python style guide, we shouldn't use "==" on singletons, especially with None.
# Rather, we should check the identity of the object with the "is" operator.

# "is" doesn't check whether two things are equal. It checks whether their ids are equal

In [21]:
id(None) == id(new_none)

True

In [22]:
# we can say the same thing, much better:

None is new_none

True
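One practical reason to prefer `is` here: `==` asks the object itself, and a class can answer however it likes. A minimal sketch, using a hypothetical `AlwaysEqual` class:

```python
class AlwaysEqual:
    """Hypothetical class whose instances claim to be equal to everything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()

equal_to_none = (obj == None)   # calls AlwaysEqual.__eq__, which says yes
same_as_none = (obj is None)    # compares identity; no method can override it

print(equal_to_none)   # True
print(same_as_none)    # False
```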

In [25]:
ni = type(NotImplemented)()

In [26]:
NotImplemented is ni

True

In [27]:
id(5)  # yes, you get back a unique ID of the object... which, in CPython, happens to be its address in memory!

4466981456

In [28]:
f = open('/etc/passwd')

f

<_io.TextIOWrapper name='/etc/passwd' mode='r' encoding='UTF-8'>

# What does `type` do?

It actually does *two* things:

1. If you give it just one argument, it returns the type of the object, basically the same thing that you would get from the object's `__class__` attribute.
2. If you give it three arguments (a name, a tuple of base classes, and a dict of attributes), it creates and returns a new class. This is pretty rare.
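Both uses can be sketched in a few lines. The class name `Point` and its `dimensions` attribute are invented for this example:

```python
# Three arguments: type(name, bases, namespace) builds a class at runtime
Point = type('Point', (object,), {'dimensions': 2})

p = Point()

# One argument: type(obj) returns the object's class
print(type(p) is Point)          # True
print(type(p) is p.__class__)    # True
print(Point.dimensions)          # 2
print(type(Point))               # <class 'type'> -- classes are objects, too
```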

In [29]:
type('abcd')   # this will return 'abcd'.__class__

str

In [30]:
x = 100
y = 100

x == y    # are these the same value?

True

In [31]:
x is y    # are these the same object in memory?

True

In [32]:
x = 1000
y = 1000

x == y

True

In [33]:
x is y

False

CPython expects us to use a lot of small integers, and thus creates -- when it starts up -- all of the integers from -5 through 256. So any time you use one of those integers, Python just grabs the object it already has available. Thus, two references to the same small integer always refer to the same object, and `is` returns `True`.

But once you get into larger integers, that's no longer guaranteed. This caching is an implementation detail, not something to rely on in your code.
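Here is a sketch of the same experiment outside of a notebook. It assumes CPython, and it builds the integers at runtime with `int()` so that the compiler cannot share one constant between the two names:

```python
# Build the integers at runtime so the compiler can't share constants for us
small_a = int('100')
small_b = int('100')
large_a = int('1000')
large_b = int('1000')

print(small_a is small_b)   # True on CPython: both refer to the cached 100
print(large_a is large_b)   # False on CPython: two separate int objects
print(large_a == large_b)   # True: the values are equal
```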

In [34]:
x = 'abcd'
y = 'abcd'

x == y

True

In [35]:
x is y

True

In [36]:
x = 'abcd' * 10_000
y = 'abcd' * 10_000

x == y

True

In [37]:
x is y

False

In [38]:
x = 'ab.cd'
y = 'ab.cd'

x == y

True

In [39]:
x is y

False

What's going on?

When we assign to a global variable in Python, the variable's name, as a string, is used as the key in an internal dict that stores our value. If every store or retrieval of a variable's value needed a brand-new string object for that key, we would waste both time and memory.

Python's solution is to cache (or "intern") such strings: a short string that contains only characters that are legal in an identifier is created once and then reused. The first time Python sees such a string, it's really created. The second and subsequent times, we just get the same object back. The exact rules, including any length limit, are implementation details of CPython and have changed between versions.

- If the string is long (as with `'abcd' * 10_000`), then this caching doesn't happen
- If the string contains `.` (or some other character that is illegal in an identifier), then it doesn't either.

This is transparent to us, but if you use `is` to compare strings, you'll discover it.
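You can also ask for this caching explicitly with `sys.intern`. The following sketch assumes CPython and builds its strings at runtime with `str.join`, so that neither one starts out as a cached literal:

```python
import sys

# Build the strings at runtime, so neither is an interned literal
a = ''.join(['ab', '.cd'])
b = ''.join(['ab', '.cd'])

print(a == b)    # True: same characters
print(a is b)    # False on CPython: two separate str objects

# sys.intern returns the one canonical copy of a string from the intern table
a_interned = sys.intern(a)
b_interned = sys.intern(b)

print(a_interned is b_interned)   # True
```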

In [40]:
x = 100

globals()['x']  # retrieves the value of x

100

In [42]:
globals()['x'] = 9876

In [43]:
x

9876

Only use `is` to compare with `None`. Otherwise, use `==`.



In [45]:
# False is a singleton, too
bool(0) is bool(0)

True

In [46]:
# True is a singleton, too
bool(1) is bool(1)

True

# Integers

Many people new to Python ask: What is the biggest integer we're allowed? Or how many bits are our integers?

This is the *wrong* question to ask, because integers are objects, and they manage their own memory. Integers can be as big as you want, so long as you don't run out of memory.
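As a quick sketch, an integer just keeps growing past the 64-bit boundary where a fixed-width integer would overflow. The method `int.bit_length()` reports how many bits the value needs:

```python
big = 2 ** 64                  # one more than the largest unsigned 64-bit value
print(big)                     # 18446744073709551616
print(big.bit_length())        # 65

bigger = big ** 10             # 2 ** 640, still an ordinary int
print(bigger.bit_length())     # 641
print(bigger == 2 ** 640)      # True
```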

In [47]:
import sys   

sys.getsizeof(0)   # how many bytes does something in Python take up?

28

In [51]:
sys.getsizeof(10_000_000_000_000_000)

32

In [52]:
x = 10_000_000_000_000_000
x = x ** 1000

In [53]:
sys.getsizeof(x)

7112

In [55]:
x = x ** 100

In [56]:
sys.getsizeof(x)

708704

In [57]:
# floats

0.1 + 0.2 

0.30000000000000004

In [58]:
# what if we could keep our number in decimal, and never go to binary?
# we trade longer execution time and more memory for exact decimal arithmetic

from decimal import Decimal 

x = Decimal('0.1')
y = Decimal('0.2')

x + y

Decimal('0.3')
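Note that the `Decimal` objects above are created from strings. A short sketch of why: passing a float to `Decimal` preserves the float's binary approximation, error and all. The names `from_string` and `from_float` are just for illustration:

```python
from decimal import Decimal

from_string = Decimal('0.1')   # exactly one tenth
from_float = Decimal(0.1)      # the exact value of the binary float nearest to 0.1

print(from_string == from_float)    # False
print(from_float)                   # a long decimal starting with 0.1000000000000000055

decimal_sum_is_exact = Decimal('0.1') + Decimal('0.2') == Decimal('0.3')
float_sum_is_exact = 0.1 + 0.2 == 0.3

print(decimal_sum_is_exact)         # True
print(float_sum_is_exact)           # False
```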

In [59]:
float(x+y)

0.3

In [60]:
# another solution to the float problem: Use the builtin round() function

round(0.1 + 0.2, 2)  # round to 2 digits after the decimal point

0.3

In [61]:
# another solution: use ints!
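One common way to do this, sketched here with made-up prices, is to store money as a whole number of cents and convert to dollars only for display:

```python
# Hypothetical prices, stored as whole cents rather than fractional dollars
price_cents = 10      # $0.10
tax_cents = 20        # $0.20

total_cents = price_cents + tax_cents       # integer addition is exact
dollars, cents = divmod(total_cents, 100)   # split into dollars and leftover cents
formatted = f'${dollars}.{cents:02d}'

print(total_cents)    # 30
print(formatted)      # $0.30
```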

In [62]:
sys.getsizeof(0.1)

24

In [63]:
sys.getsizeof(1234567890.1234567890)

24

In [64]:
sys.getsizeof(x)

104

In [None]:
# teraflops  -- trillions of floating point operations per second

In [65]:
x = 12345.6789
y = 98765.4321

%timeit x * y

26.1 ns ± 1.76 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [66]:
x = Decimal('12345.6789')
y = Decimal('98765.4321')

%timeit x * y

74.1 ns ± 2.01 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [68]:
# Python does have a third numeric type: Complex!

x = 20+3j
y = 15-8j

x + y

(35-5j)

In [69]:
sys.getsizeof(x)

32

In [70]:
sys.getsizeof(y)

32

# Lists

Many people, when they come to Python, call lists "arrays." This isn't accurate! Lists aren't arrays, because:

1. Arrays have a fixed size, set when we create them
2. All of the elements of an array must be of the same type

Neither of these is true regarding lists. We can modify them (their contents and their lengths), and we can also put any type we want, and any combination of types we want in a list.

That said: The convention in Python is to have lists contain only one type.

But... behind the scenes, a list is implemented as an array.  How?

1. A list's array is allocated with extra space. So when we add new elements, those spare slots are used.
2. When we run out of spare slots, a new, larger array is allocated, with extra space there as well.
3. Since all values in Python are referred to via pointers, we can argue that the array contains only one type, namely `PyObject *` in C.
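We can watch the over-allocation happen with `sys.getsizeof`. This is a sketch that assumes CPython; the exact sizes and growth pattern are implementation details, so only the overall shape matters:

```python
import sys

items = []
last_size = sys.getsizeof(items)
resize_points = []             # (length, size in bytes) each time the list grew

for i in range(64):
    items.append(i)
    size = sys.getsizeof(items)
    if size != last_size:      # the underlying array was reallocated
        resize_points.append((len(items), size))
        last_size = size

# Far fewer reallocations than appends: most appends used spare slots
print(len(resize_points))
print(resize_points)
```

The list's size changes only at a handful of lengths. Every other `append` simply fills a slot that was already allocated.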