# 4 Introducing Python Object Types

**Intro**
Data takes the form of *objects* in Python. Objects are essentially just pieces of memory, with values and sets of associated operations.

Objects are the most fundamental notion in Python. Python ships with built-in objects.

**The Python Conceptual Hierarchy**
Python programs can be decomposed into modules, statements, expressions, and objects.
1. Programs are composed of modules.
2. Modules contain statements.
3. Statements contain expressions.
4. Expressions create and process objects.

Built-in types are a mandatory exploration for all Python journeys. Later, we will both use and emulate built-in object types to create our own objects with OOP (object-oriented programming).

In lower-level languages such as C or C++, you need to lay out memory structures, manage memory allocation, and so on when implementing objects. In typical Python programming, you don't have to worry about that: Python provides powerful object types as an intrinsic part of the language.

**Why Use Built-in Types?**
- Built-in objects make programs easier to write.
- Built-in objects are components of extensions. (For example, a stack data structure may be implemented as a class that manages or customizes a built-in list.)
- Built-in objects are often more efficient than custom data structures.
- Built-in objects are a standard part of the language. 

Built-in objects form the core of all Python programs.
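To make the stack example above concrete, here is a minimal sketch of a class that manages a built-in list. The class name `Stack` and its method names are hypothetical, chosen only for illustration.

```python
class Stack:
    """Hypothetical stack that delegates its storage to a built-in list."""

    def __init__(self):
        self._items = []           # the built-in list does the real work

    def push(self, item):
        self._items.append(item)   # add to the top of the stack

    def pop(self):
        return self._items.pop()   # remove and return the top item

    def __len__(self):
        return len(self._items)


stack = Stack()
stack.push('spam')
stack.push('eggs')
top = stack.pop()
print(top, len(stack))
```

The class itself adds very little code; the list supplies growth, shrinking, and memory management.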

**Python's Core Data Types**
Literal: an expression whose syntax generates an object.

Built-in objects:
- Numbers
- Strings
- Lists
- Dictionaries
- Tuples
- Files 
- Sets
- Other core types: Booleans, types, None
- Program unit types: Functions, modules, classes
- Implementation-related types: Compiled code, stack tracebacks

However, this list isn't really complete, as *everything* we process in Python is some kind of object.

Once you create an object, its operation set is bound for all time (e.g., you can only perform string operations on a string). Python is *dynamically typed*: it keeps track of types for you automatically instead of requiring declaration code.
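A small sketch of both ideas, assuming nothing beyond the built-in types: names need no declarations, but each object only supports the operations of its own type.

```python
value = 'spam'            # no declaration: the name refers to a str object
first_type = type(value)
value = 42                # the same name may later refer to an int object
second_type = type(value)
print(first_type, second_type)

try:
    'spam' + 1            # a string cannot be added to a number
except TypeError as exc:
    error = exc
    print('TypeError:', exc)
```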

**Numbers**

Numbers are fairly straightforward. Python integers automatically provide extra precision for large numbers when needed.

In [3]:
3.1415*2

6.283

Python has two display formats for objects: the as-code *repr* and the more user-friendly *str*.
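The following sketch illustrates the extra integer precision and the two display formats. The string with an embedded newline is just an illustrative value; it makes the difference between *repr* and *str* easy to see.

```python
big = 2 ** 100             # integers grow to whatever size is needed
print(big)

text = 'a\nb'
as_code = repr(text)       # quotes and escapes, as you would type it in code
friendly = str(text)       # the text itself, which prints across two lines
print(as_code)
print(friendly)
```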

In [4]:
# A handful of useful numeric modules ship with Python as well
import math
math.pi

3.141592653589793

In [12]:
import random
print(random.random())
print(random.choice([1,2,3,4]))

0.13773440822914507
3


**Strings**

Strings are used to record both textual information and arbitrary collections of bytes (an image file's contents, for instance).
They are the first example of what Python calls a *sequence*: a positionally ordered collection of other objects.

In [19]:
S = 'Spam'
print(len(S))
print(S[0])
print(S[1])
print(S[-1])  # the same as the one below
print(S[len(S)-1])
print(S[-2])

4
S
p
m
m
a


Anywhere that Python expects a value, we can use a literal, a variable, or any expression we wish.

In [20]:
print(S)
print(S[1:3]) # Slice from offset 1 up to, but not including, offset 3

Spam
pa


In a slice, the left bound defaults to zero, and the right bound defaults to the length of the sequence being sliced.
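A short sketch of these defaults, using the same string as above:

```python
S = 'Spam'
first_three = S[:3]      # left bound omitted: starts at offset 0
all_but_last = S[:-1]    # negative right bound counts back from the end
whole = S[:]             # both bounds omitted: the entire sequence
print(first_three, all_but_last, whole)
```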

In [23]:
print(S[1:]) # Same as S[1:len(S)]
print(S) 
print(S[0:3])

pam
Spam
Spa


In [24]:
S + 'xyz' # Concatenation

'Spamxyz'

In [25]:
S * 8 # Repetition

'SpamSpamSpamSpamSpamSpamSpamSpam'

*Polymorphism*: The meaning of an operation depends on the objects being operated on.
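As a quick illustration, the same `+` operator and the same `len` call do different things depending on the objects they are given:

```python
number_sum = 1 + 2                        # addition for numbers
string_sum = 'Spam' + 'xyz'               # concatenation for strings
list_sum = [1, 2] + [3]                   # concatenation for lists
lengths = (len('Spam'), len([1, 2, 3]))   # len works on any sized object
print(number_sum, string_sum, list_sum, lengths)
```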

Immutable objects: Objects that cannot be changed in place.

-> To "change" an immutable object, you can use an expression to build a new object and assign it to the same name. This is not especially efficient, however.

Among other things immutability can be used in Python to guarantee that an object remains constant throughout the program.
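A minimal sketch of what immutability means for strings: in-place assignment fails, while building a new string and rebinding the name works.

```python
S = 'Spam'
try:
    S[0] = 'z'            # strings do not support item assignment
except TypeError as exc:
    print('TypeError:', exc)

S = 'z' + S[1:]           # build a new string and rebind the name S
print(S)
```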

There is also a data type for text-based data, the *bytearray*, that is a sort of hybrid between immutable strings and mutable lists. A bytearray supports in-place changes to text, but only if its characters are all at most 8 bits wide (e.g., ASCII).

You can also change text-based data in place if you expand it to a list, and then join it back together to a string.



In [26]:
S = 'shrubbery'
L = list(S)
L

['s', 'h', 'r', 'u', 'b', 'b', 'e', 'r', 'y']

In [28]:
L[1] = 'c'

In [29]:
''.join(L)

'scrubbery'

In [30]:
B = bytearray(b'spam')

In [31]:
B.extend(b'eggs')

In [32]:
B

bytearray(b'spameggs')

In [33]:
B.decode()

'spameggs'

You can use generic sequence operations on strings, because strings are sequences. You can however also use *type-specific methods*. Methods are functions that are attached to and act upon a specific object.

In [35]:
dir('') # dir() lists all the attributes (including methods) attached to an object

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


In [36]:
S = 'spam'
S.find('pa')

1

In [39]:
S.replace('pa','XYZ') # Returns a NEW string object in which the matching characters have been "replaced"

'sXYZm'

S


In [2]:
line = 'aaa,bbbb,cccc,dddd'
line.split(',') # Split on a delimiter into a list of substrings

['aaa', 'bbbb', 'cccc', 'dddd']

In [3]:
S = 'spam'
S.upper()

'SPAM'

In [4]:
S.isalpha()

True

In [5]:
S.isdigit()

False

In [6]:
line = 'aaaa,bbb,cccc,dd\n\n   '

In [8]:
line.rstrip() # Strip whitespace characters off the end of the string

'aaaa,bbb,cccc,dd'

In [10]:
line.rstrip().split(',') # Strips before it splits: method calls run left to right

['aaaa', 'bbb', 'cccc', 'dd']

In [14]:
'{} and {} and {}'.format('ball','apple','sword') # new formatting method, numbers can be omitted

'ball and apple and sword'

In [13]:
'%s and %s and %s' % ('ball','football','snow') # Old formatting expression

'ball and football and snow'

In [16]:
'{:,.2f}'.format(294329234.235345435) # Separators, decimal digits

'294,329,234.24'

In [19]:
'%.2f | %+05d' % (3.14159, -42)

'3.14 | -0042'

**Note:** Python's toolset is layered: operations that span multiple types show up as built-in functions or expressions, but *type-specific* operations are method calls.

In [21]:
dir(S) # useful function that returns a list of all the attributes available for any object passed to it

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


**Note:** Names with leading and trailing double underscores follow the naming pattern Python uses for implementation details.

In [25]:
help(S.replace) # asks what the replace method of a string does

Help on built-in function replace:

replace(old, new, count=-1, /) method of builtins.str instance
    Return a copy with all occurrences of substring old replaced by new.
    
      count
        Maximum number of occurrences to replace.
        -1 (the default value) means replace all occurrences.
    
    If the optional argument count is given, only the first count occurrences are
    replaced.



**Note:** You can also use the *dir* and *help* functions with type names (e.g., str, list, and dict), as well as with real objects.

In [26]:
S = 'A\nB\tC' # \n is end-of-line, \t is tab
len(S) # Each stands for just one character

5

In [27]:
ord('\n') # \n is a byte with the binary value 10 in ASCII

10

In [28]:
S = 'A\0b\0C' # \0, a binary zero byte, does not terminate string

In [29]:
len(S)

5

In [31]:
S # Non-printables are displayed as \xNN hex escapes

'A\x00b\x00C'

**Note:** Python allows strings to be enclosed in *single* or *double* quotes. They mean the same thing, but each form allows the other type of quote to be embedded without an escape (most programmers prefer to use single quotes). Python also allows multiline string literals enclosed in *triple* quotes. When this form is used, all the lines are concatenated together, and end-of-line characters are added where line breaks appear.
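A brief sketch of the quoting forms just described; the sample strings are arbitrary.

```python
double = "knight's"        # single quote embedded in a double-quoted string
single = 'say "Ni"'        # double quotes embedded in a single-quoted string
escaped = 'knight\'s'      # the same quote type needs a backslash escape
block = """line one
line two"""                # triple quotes: the line break becomes \n

print(double == escaped)
print(repr(block))
```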

In [32]:
msg = """aklsjdasssssssssssssssssss
dsfssssssssssssss
dsfsfffffsdf"""

In [33]:
print(msg)

aklsjdasssssssssssssssssss
dsfssssssssssssss
dsfsfffffsdf


**Note:** Python also supports a *raw* string literal that turns off the backslash escape mechanism (useful for Windows paths, for example). Just add an r right before the opening quote.

The escape *\u* can be used to embed Unicode code points, given as hexadecimal ordinal values, in text strings. Python 3.X does not allow normal and bytes strings to mix without explicit conversion (using the *encode* and *decode* methods).
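The sketch below tries out raw strings, the *\u* escape, and explicit conversion between text and bytes. The Windows-style path is a made-up example.

```python
path = r'C:\text\new'          # raw string: \t and \n stay as two characters each
print(path, len(path))

text = 'sp\u00c4m'             # \u00c4 embeds the code point for 'Ä'
data = text.encode('utf-8')    # str -> bytes
print(data)
round_trip = data.decode('utf-8')   # bytes -> str

try:
    text + data                # str and bytes do not mix implicitly
except TypeError as exc:
    mix_error = exc
    print('TypeError:', exc)
```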

**Note**: Unicode processing mostly reduces to transferring text data to and from files - text is encoded to bytes when stored in a file, and decoded into characters (a.k.a. code points) when read back into memory. Once it is loaded, we usually process text as strings in decoded form only.

Files are also content-specific in 3.X: text files accept and return str strings, but binary files instead deal in bytes strings for raw binary data.

## Pattern Matching
Text pattern matching can be done using the standard library module *re*. This module has analogous calls for searching, splitting, and replacement, but because we can use patterns to specify substrings, we can be much more general:

In [1]:
import re
match = re.match('Hello[ \t]*(.*)world', 'Hello     Python world')

In [2]:
match.group(1)

'Python '

In [3]:
match = re.match('[/:](.*)[/:](.*)[/:](.*)', '/usr/home:lumberjack')

In [4]:
match.groups()

('usr', 'home', 'lumberjack')

In [7]:
match.group(3)

'lumberjack'

In [8]:
re.split('[/:]', '/usr/home/lumberjack')

['', 'usr', 'home', 'lumberjack']

# Lists

**Definition**: Lists are positionally ordered collections of arbitrarily typed objects, and they have no fixed size. They are also *mutable*, in other words susceptible to in-place changes.

Lists support all the sequence operations we saw for strings, e.g.:

In [33]:
L = [123, 'spam', 1.23] # A list of three different-type objects
len(L) # Number of items in list

3

In [12]:
L[0] # Indexing by position

123

In [13]:
L[:-1] # Slicing a list returns a new list

[123, 'spam']

In [15]:
L + [4,5,6] # Concat/repeat make new lists too

[123, 'spam', 1.23, 4, 5, 6]

In [16]:
L * 2

[123, 'spam', 1.23, 123, 'spam', 1.23]

## Type-Specific Operations

In [34]:
L.append('NI') # Growing: add object at end of list
L

[123, 'spam', 1.23, 'NI']

In [21]:
L.pop(1) # Shrinking: delete an item in list by position, and return that object

'spam'

In [19]:
L

[123, 1.23, 'NI']

In [30]:
del L[1] # Another way to delete from list

In [35]:
L

[123, 'NI']

Other type-specific methods include *insert*, to insert an item at an arbitrary position; *remove*, to remove a given item by value; *extend*, to add multiple items at the end; and more.
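A quick sketch of the three methods just named, starting from the same three-item list used earlier:

```python
L = [123, 'spam', 1.23]
L.insert(1, 'eggs')       # insert at offset 1
L.remove('spam')          # remove the first item equal to 'spam'
L.extend([4, 5])          # add several items at the end
print(L)
```

All three calls change the list in place and return None.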

**Note**: Since lists are mutable, most list methods tend to change the object in place, instead of creating a new one.

**Bounds Checking**: Referencing or assigning an item off the end of a list is always an error. To grow a list, call a method such as *append* instead.
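The following sketch shows both kinds of out-of-bounds error and the supported way to grow a list. The offset 99 is arbitrary.

```python
L = [123, 'spam', 'NI']
errors = []

try:
    L[99]                       # reference off the end
except IndexError as exc:
    errors.append(str(exc))

try:
    L[99] = 1                   # assignment off the end
except IndexError as exc:
    errors.append(str(exc))

L.append(1)                     # growing the list explicitly works
print(errors, L)
```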

**Nesting**: Python's core data types support arbitrary *nesting*. We can nest in any combination, as deeply as we like.
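As a small sketch of nesting, a list of lists can stand in for a matrix, and a dictionary can hold lists and other dictionaries. The names and values here are illustrative only.

```python
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
row = M[1]                # the second row: [4, 5, 6]
item = M[1][2]            # third item of the second row: 6

record = {'name': {'first': 'Bob'}, 'jobs': ['dev', 'mgr']}
first_job = record['jobs'][0]
print(row, item, first_job)
```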

## Comprehensions

In [1]:
M = [[1,2,3],
    [4,5,6],
    [7,8,9]] # a 3x3 matrix

In [4]:
[row[1] for row in M] # Second column of M in a list

[2, 5, 8]

In [5]:
M

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

**Note:** List comprehensions can be more complex.

In [7]:
[row[1] + 1 for row in M] # Add 1 to each item in column 2

[3, 6, 9]

In [9]:
[row[1] for row in M if row[1] % 2 == 0] # filter out odd numbers

[2, 8]

In [10]:
diag = [M[i][i] for i in [0, 1, 2]]

In [11]:
diag

[1, 5, 9]

**Note:** List comprehensions can be used to iterate over any *iterable* object.

In [13]:
doubles = [c * 2 for c in 'spam'] # repeat characters of string

In [14]:
doubles

['ss', 'pp', 'aa', 'mm']

In [15]:
type(range(4))

range

In [16]:
range(4)

range(0, 4)

In [17]:
print(range(4))

range(0, 4)


In [18]:
list(range(4))

[0, 1, 2, 3]

In [19]:
list(range(-10,11,2))

[-10, -8, -6, -4, -2, 0, 2, 4, 6, 8, 10]


**Note:** Comprehensions are not limited to lists. The same syntax, enclosed in parentheses, can also be used to create generators (iterable objects that produce results on demand), for example.

In [7]:
G = (sum(row) for row in M) # create a generator of row sums

In [3]:
next(G)

6

In [4]:
next(G)

15

In [5]:
next(G)

24

In [6]:
next(G)

StopIteration: 

In [9]:
list(map(sum,M)) # Map sum over items in M

[6, 15, 24]

In [12]:
{sum(row) for row in M} # set comprehension

{6, 15, 24}

In [13]:
{i:sum(M[i]) for i in range(3)} # dictionary comprehension

{0: 6, 1: 15, 2: 24}

# Dictionaries

**Definition**: Dictionaries are *mappings*: collections that store objects by key rather than by relative position, as lists do. Dictionaries are also mutable.

In [15]:
D = {'food': 'Spam', 'quantity': 4, 'color': 'pink'}

In [17]:
D['food'] # Fetch value by key

'Spam'

In [21]:
D['quantity'] += 1 # Add 1 to 'quantity' value

In [22]:
D

{'food': 'Spam', 'quantity': 5, 'color': 'pink'}

**Note:** We most often see dictionaries built up in other ways than the curly-braces literal form.

In [23]:
D = {}

In [27]:
D['name'] = 'Bob' # Create keys by assignment

In [25]:
D['job'] = 'dev'

In [26]:
D['age'] = 40

In [28]:
D

{'name': 'Bob', 'job': 'dev', 'age': 40}

In [29]:
print(D['name'])

Bob


In [30]:
bob1 = dict(name='Bob', job='dev', age=40)

In [31]:
bob1

{'name': 'Bob', 'job': 'dev', 'age': 40}

In [34]:
bob2 = dict(zip(['name', 'job', 'age'],['Bob','dev',40]))

In [35]:
bob2

{'name': 'Bob', 'job': 'dev', 'age': 40}

**Note:** Dictionaries can, as we saw with lists, be arbitrarily nested.

In [36]:
rec =  {'name': {'first': 'Bob', 'last': 'Smith'},
       'jobs': ['dev', 'mgr'],
       'age': 40.5}

In [37]:
rec

{'name': {'first': 'Bob', 'last': 'Smith'},
 'jobs': ['dev', 'mgr'],
 'age': 40.5}

In [38]:
rec['name']

{'first': 'Bob', 'last': 'Smith'}

In [39]:
rec['name']['last']

'Smith'

In [40]:
rec['jobs']

['dev', 'mgr']

In [41]:
rec['jobs'][-1]

'mgr'

In [42]:
rec['jobs'].append('janitor')

In [43]:
rec['jobs']

['dev', 'mgr', 'janitor']

**Note:** As you can see, nesting allows us to build up complex information structures directly and easily.

In [1]:
D = {'a': 1, 'b': 2, 'c': 3}

In [2]:
D

{'a': 1, 'b': 2, 'c': 3}

In [3]:
D['e'] = 99 # Assigning a new key grows dictionaries

In [4]:
D

{'a': 1, 'b': 2, 'c': 3, 'e': 99}

In [5]:
D['f'] # Referencing a nonexistent key is an error

KeyError: 'f'

**Note:** Fetching a nonexistent key from a dictionary is always an error. To handle cases where we don't know whether the key is there, we can, for example, test ahead of time using the *in* membership expression.

In [6]:
'f' in D

False

In [7]:
if 'f' not in D:
    print('missing')

missing


**Note:** There are a variety of other ways to avoid accessing nonexistent keys in dictionaries as well: the *get* method (a conditional index with a default), the *try* statement, and the *if/else* ternary expression.

In [8]:
value = D.get('x',0) # Index but with a default

In [9]:
value

0

In [10]:
value = D['x'] if 'x' in D else 0 # if/else expression form

In [11]:
value

0
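The *try* statement mentioned in the note is not demonstrated in the cells above. A minimal sketch, assuming the same dictionary and the same default of 0, might look like this:

```python
D = {'a': 1, 'b': 2, 'c': 3}

try:
    value = D['x']        # attempt the fetch directly
except KeyError:
    value = 0             # fall back to a default if the key is missing

print(value)
```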

## Sorting Keys: for Loops

In [12]:
D = {'a': 1, 'b': 2, 'c': 3}

In [13]:
D

{'a': 1, 'b': 2, 'c': 3}

In [22]:
Ks = list(D.keys()) # Keys list, not yet sorted

In [23]:
D.keys()

dict_keys(['a', 'b', 'c'])

In [24]:
list(D.keys())

['a', 'b', 'c']

In [27]:
Ks = list(D.keys())

In [28]:
Ks

['a', 'b', 'c']

In [31]:
Ks.sort() # Sorted keys list

In [30]:
Ks

['a', 'b', 'c']

In [33]:
for key in Ks: # Iterate through sorted keys
    print(key, '=>', D[key])

a => 1
b => 2
c => 3


In [35]:
sorted(D) # Built-in function which returns a sorted list of the keys

['a', 'b', 'c']

In [36]:
for key in sorted(D):
    print(key, '=>', D[key])

a => 1
b => 2
c => 3


**Note:** The *for* loop is even more general than this; it can be used on all sequences, and even on some things that are not sequences. Its relative, the *while* loop, repeats as long as a test holds:

In [37]:
x = 4
while x > 0:
    print('spam!' * x)
    x -= 1

spam!spam!spam!spam!
spam!spam!spam!
spam!spam!
spam!


## Iteration and Optimization

An object is *iterable* if it is either a physically stored sequence in memory or an object that generates one item at a time in the context of an iteration operation, a sort of "virtual" sequence.

More formally, an object is considered iterable when it responds to the *iter* call with an object that advances in response to *next* calls and raises an exception when it has finished producing values.

The *generator* we saw earlier is such an object. It does not store its values in memory all at once; they are produced on demand.
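The protocol described above can be stepped through by hand. This sketch reuses the matrix of row sums from earlier; the short list `[1, 2]` is an arbitrary example.

```python
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
G = (sum(row) for row in M)     # generator: values are produced on demand
same = iter(G) is G             # a generator serves as its own iterator

I = iter([1, 2])                # a list hands out a separate iterator object
first, second = next(I), next(I)
try:
    next(I)
    finished = False
except StopIteration:           # the exception that signals "no more values"
    finished = True

rest = list(G)                  # list() and for loops make these same calls
print(same, first, second, finished, rest)
```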

In [38]:
# All list comprehensions can always be coded as an equivalent *for* loop
# that builds the result list manually by appending as it goes.
squares = [x ** 2 for x in [1, 2, 3, 4, 5]]

In [39]:
squares

[1, 4, 9, 16, 25]

In [40]:
squares = []
for x in [1, 2, 3, 4, 5]:
    squares.append(x ** 2)

In [41]:
squares

[1, 4, 9, 16, 25]

**Note:** Comprehensions generally run faster than equivalent *for* loops in Python. Still, a major rule of thumb in Python is to code for simplicity and readability first and worry about performance later.

# Tuples

**Definition**: Roughly like lists, but *immutable* (unchangeable). Functionally, they are used to represent fixed collections of items (e.g., the components of a specific calendar date).

In [42]:
T = (1, 2, 3, 4)

In [48]:
len(T) # Length

4

In [49]:
T + (5, 6) # Concatenation

(1, 2, 3, 4, 5, 6)

In [50]:
T[0] # Indexing

1

In [51]:
T[1:] # Slicing

(2, 3, 4)

**Note:** Tuples also have type-specific callable methods, but not nearly as many as lists.

In [53]:
T.index(4) # Tuple methods: 4 appears at offset 3

3

In [54]:
T.count(4) # 4 appears once

1

In [55]:
T[0] = 2 # Tuples are immutable

TypeError: 'tuple' object does not support item assignment

In [58]:
T = (2,) + T[1:] # Make a new tuple for a new value, trailing comma for one-item tuple

In [57]:
T

(2, 2, 3, 4)

**Note:** Tuples support arbitrarily mixing types and nesting like lists and dictionaries. They can't however grow and shrink on demand, because they are immutable.

In [59]:
T = 'spam', 3.0, [11, 22, 33]

In [60]:
T

('spam', 3.0, [11, 22, 33])

In [61]:
T[2][1]

22

In [62]:
T.append(4)

AttributeError: 'tuple' object has no attribute 'append'

### Why Tuples?
In practice, tuples are not used as much as lists. However, their immutability provides a sort of integrity constraint, which is useful in large programs.
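One way to picture this integrity constraint is a function elsewhere in a program that tries to change what it is given. The function `careless` below is hypothetical and exists only for this sketch.

```python
def careless(record):
    """Hypothetical function that modifies its argument in place."""
    record[0] = 'changed'


as_list = ['spam', 3.0]
careless(as_list)               # the caller's list is silently changed

as_tuple = ('spam', 3.0)
try:
    careless(as_tuple)          # the same change is rejected for a tuple
    blocked = False
except TypeError:
    blocked = True

print(as_list, as_tuple, blocked)
```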

# Files

**Definition**: File objects are Python code's main interface to external files on your computer. Files are a core type, but they have no literal syntax; file objects are created by calling the built-in *open* function.

In [64]:
f = open('data.txt', 'w') # Make a new file in output mode ('w' is write)
f.write('Hello\n') # Write strings of characters to it

6

In [65]:
f.write('world\n') # Returns the number of characters written in Python 3.X

6

In [66]:
f.close() # Close to flush output buffers to disk

**Note:** A file's contents are always a string in your script, regardless of the type of data the file contains.

In [67]:
f = open('data.txt') # 'r' (read) is the default processing mode

In [68]:
text = f.read() # Read entire file into a string

In [69]:
text

'Hello\nworld\n'

In [70]:
print(text) # print interprets control characters

Hello
world



In [71]:
text.split() # File content is always a string

['Hello', 'world']

**Note:** However, the best way to read a file today is often to *not read it at all*: files provide an *iterator* that automatically reads line by line in *for* loops and other contexts.

In [72]:
for line in open('data.txt'): print(line)

Hello

world



In [73]:
dir(f)

['_CHUNK_SIZE',
 '__class__',
 '__del__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__enter__',
 '__eq__',
 '__exit__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__next__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '_checkClosed',
 '_checkReadable',
 '_checkSeekable',
 '_checkWritable',
 '_finalizing',
 'buffer',
 'close',
 'closed',
 'detach',
 'encoding',
 'errors',
 'fileno',
 'flush',
 'isatty',
 'line_buffering',
 'mode',
 'name',
 'newlines',
 'read',
 'readable',
 'readline',
 'readlines',
 'reconfigure',
 'seek',
 'seekable',
 'tell',
 'truncate',
 'writable',
 'write',
 'write_through',
 'writelines']

In [74]:
help(f.seek)

Help on built-in function seek:

seek(cookie, whence=0, /) method of _io.TextIOWrapper instance
    Change stream position.
    
    Change the stream position to the given byte offset. The offset is
    interpreted relative to the position indicated by whence.  Values
    for whence are:
    
    * 0 -- start of stream (the default); offset should be zero or positive
    * 1 -- current stream position; offset may be negative
    * 2 -- end of stream; offset is usually negative
    
    Return the new absolute position.



## Binary Bytes Files

**Note:** Python draws a sharp distinction between text and binary data. Text files represent content as normal *str* strings, while binary files represent content as a special *bytes* string that allows you to access file content unaltered.

In [7]:
t = b'spam\xFF'

In [8]:
t

b'spam\xff'

In [3]:
type(t)

bytes

In [4]:
dir(t)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'center',
 'count',
 'decode',
 'endswith',
 'expandtabs',
 'find',
 'fromhex',
 'hex',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdigit',
 'islower',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

**Note:** Python's *struct* module can both create and unpack *binary data* (raw bytes that record values that are not Python objects) to be written to a file in binary mode.

In [16]:
import struct
packed = struct.pack('>i4sh', 7, b'spam', 8) # Create packed binary data

In [17]:
packed

b'\x00\x00\x00\x07spam\x00\x08'

In [19]:
file = open('data.bin', 'wb') # Open binary output file

In [20]:
file.write(packed) # Write packed binary data

10

In [21]:
file.close()

**Note:** Reading binary data is essentially symmetric.

In [22]:
data = open('data.bin', 'rb').read()

In [23]:
print(data)

b'\x00\x00\x00\x07spam\x00\x08'


In [24]:
data

b'\x00\x00\x00\x07spam\x00\x08'

In [30]:
list(data)

[0, 0, 0, 7, 115, 112, 97, 109, 0, 8]

In [31]:
struct.unpack('>i4sh', data)

(7, b'spam', 8)

## Unicode Text Files

In [36]:
S = 'sp\xc4m' # Non-ASCII Unicode text

In [33]:
S

'spÄm'

In [35]:
S[2] # Sequence of characters

'Ä'

In [37]:
file = open('unidata.txt', 'w', encoding='utf-8')

In [38]:
file.write(S)

4

In [39]:
file.close()

In [40]:
text = open('unidata.txt', encoding='utf-8').read()

In [41]:
text

'spÄm'

In [42]:
len(text)

4

**Note:** As you can see, encoding and decoding happen automatically during transfers to and from the file. You can still see what is truly stored by opening the file in binary mode.

In [43]:
raw = open('unidata.txt', 'rb').read()

In [44]:
raw

b'sp\xc3\x84m'

In [45]:
len(raw)

5

In [46]:
text

'spÄm'

In [47]:
text.encode('utf-8')

b'sp\xc3\x84m'

In [48]:
raw.decode('utf-8')

'spÄm'

### Other File-Like Tools
Several other file-like tools also exist that you will most likely meet later in your Python career, including pipes, FIFOs, sockets, keyed-access files, persistent object shelves, descriptor-based files, and relational and object-oriented database interfaces.

# Other Core Types

## Sets
**Definition:** Unordered collections of unique and immutable objects. A set is much like the keys of a valueless dictionary.

Sets can be created with the *{...}* literal syntax or with the built-in *set* function. Note that an empty *{}* is a dictionary, so an empty set must be written as *set()*.
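One detail worth checking for yourself: curly braces with items make a set, but empty curly braces make a dictionary. A short sketch:

```python
letters = {'h', 'a', 'm'}     # set literal with items
empty_dict = {}               # empty braces are a dictionary, not a set
empty_set = set()             # the only way to write an empty set
print(type(letters), type(empty_dict), type(empty_set))
```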

In [2]:
X = set('spam') # Make a set out of a sequence

In [3]:
Y = {'h', 'a', 'm'}

In [51]:
X, Y

({'a', 'm', 'p', 's'}, {'a', 'h', 'm'})

In [59]:
X & Y # Intersection

{'a', 'm'}

In [60]:
X | Y # Union

{'a', 'h', 'm', 'p', 's'}

In [61]:
X - Y # Difference

{'p', 's'}

In [5]:
X > Y # Superset

False

In [6]:
set('spam') > set('pa')

True

In [7]:
set('pa') > set('spam')

False

In [9]:
{n ** 2 for n in [1, 2, 3, 4]} # Set comprehensions

{1, 4, 9, 16}

In [11]:
list(set([1,2,3,2,2,3])) # Filtering out duplicates (possibly reordered)

[1, 2, 3]

In [12]:
set('spam') - set('ham')

{'p', 's'}

In [13]:
set('spam') == set('asmp') # Order-neutral equality tests 

True

**Note:** Sets, and all other collection types in Python, support *in* membership tests.

In [14]:
's' in set('spam'), 's' in 'spam', 'ham' in ['eggs', 'spam', 'ham']

(True, True, True)

**Note:** Python has quite a few numeric types, including *decimal* numbers, which are fixed-precision floating-point numbers, and *fraction* numbers, which are rational numbers with both a numerator and a denominator.

**Tip:** These numeric types can be used to work around the limitations and inherent inaccuracies of floating-point math.

In [17]:
1/3 # Floating point

0.3333333333333333

In [16]:
(2/3) + (1/2)

1.1666666666666665

In [18]:
import decimal
d = decimal.Decimal('3.141')

In [19]:
d + 1

Decimal('4.141')

In [22]:
decimal.getcontext().prec = 2
decimal.Decimal('1.00') / decimal.Decimal('3.00')

Decimal('0.33')

In [23]:
from fractions import Fraction
f= Fraction(2, 3)

In [24]:
f+1

Fraction(5, 3)

In [25]:
f + Fraction(1, 2)

Fraction(7, 6)

**Note:** *Booleans* are predefined **True** and **False** objects that are essentially just the integers 1 and 0 with custom display logic. Python also has a special placeholder object called **None** commonly used to initialize names and objects.

In [26]:
1 > 2, 1 < 2

(False, True)

In [27]:
bool('spam') # Object's Boolean value

True

In [28]:
X = None # None placeholder

In [29]:
print(X)

None


In [32]:
L = [None] * 100 # Initialize a list of 100 Nones
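The claim that Booleans are essentially the integers 1 and 0 with custom display logic can be checked directly. A small sketch:

```python
is_int = isinstance(True, int)     # bool is a kind of int
total = True + True                # arithmetic treats True as 1
shown = str(True)                  # but it displays as the word True
equal = (True == 1, False == 0)
print(is_int, total, shown, equal)
```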

**Note:** Python has a *type* built-in function that returns a *type* object revealing the type of another object. In Python 3.X types are classes, and vice versa.

In [36]:
print(type(L))

<class 'list'>


In [35]:
print(type(type(L)))

<class 'type'>


In [38]:
if type(L) == type([]): # Type testing, if you must ..
    print('yes')

yes


In [39]:
if type(L) == list: # Using the type name
    print('yes')

yes


In [41]:
if isinstance(L, list): # Object-oriented tests
    print('yes')


yes


**Important:** Type testing is almost always the wrong thing to do in Python, because it limits your code's flexibility: the code then works only for the types you test for. This is related to the idea of *polymorphism*.
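As a sketch of the alternative, the hypothetical helper below performs no type tests at all. It relies only on the operations it uses (iteration and `len`), so it works with any objects that support them.

```python
def total_length(items):
    """Hypothetical helper: sums the lengths of any iterable of sized objects."""
    return sum(len(item) for item in items)


from_list = total_length(['spam', 'ham'])          # list of strings
from_tuple = total_length(('spam', [1, 2, 3]))     # tuple of mixed objects
from_set = total_length({'ab', 'cd'})              # set of strings
print(from_list, from_tuple, from_set)
```

Had the function begun with a check such as `type(items) == list`, the last two calls would have been rejected for no good reason.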

## User-Defined Classes

**Definition:** In abstract terms, classes define new types of objects that extend the core set.

In [43]:
class Worker:
    def __init__(self, name, pay): # Initialize when created
        self.name = name # self is the new object
        self.pay = pay
    def lastName(self):
        return self.name.split()[-1] # Split strings on blanks, return last item
    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)

    

This class defines a new kind of object that will have **name** and **pay** attributes (sometimes called *state information*), as well as two bits of behavior coded as functions (normally called *methods*).

In [46]:
bob = Worker('Bob Smith', 50000) # Make two instances
sue = Worker('Sue Jones', 60000) # Each has name and pay attrs

In [45]:
dir(bob)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'giveRaise',
 'lastName',
 'name',
 'pay']

In [47]:
bob.lastName() # Call method: bob is self

'Smith'

In [48]:
sue.lastName() # sue is the self subject

'Jones'

In [51]:
sue.pay

60000

In [52]:
sue.giveRaise(.10)

In [53]:
sue.pay

66000.0

**Note:** We call this model *object-oriented* because there is always an implied subject in the functions within a class. A class builds on and uses core types. The *Worker* object here, for instance, is just a collection of a string and a number, plus functions for processing those two built-in objects.

**Finishing Note:** Even though everything in Python is an "object", only those types of objects we've met so far are considered part of Python's core type set. Other types are either objects related to program execution (e.g., functions, modules, classes, and compiled code), or are implemented by imported module functions, not language syntax.

Also keep in mind that the objects we've met here are objects, but not necessarily *object-oriented*. (Object-oriented programming usually requires inheritance and the Python **class** statement.)

In [54]:
s = {'h', 'y', 'z'}

In [55]:
for x in s: print(x)

z
y
h


# Chapter 5: Numeric Types

**Intro:** This chapter begins an in-depth tour of the Python language. Objects are the most fundamental notion in Python, and therefore a good place to start: they are the basis of every Python program you will ever write. We will start with Python's numeric types and operations.

## Numeric Type Basics

Numeric types are used to represent just about any numeric quantity. The inventory of Python's numeric toolbox includes:
- Integer and floating-point objects
- Complex number objects
- Decimal: fixed-precision objects
- Fraction: rational number objects
- Sets: collections with numeric operations
- Booleans: true and false
- Built-in functions and modules: **round, math, random,** etc.
- Expressions; unlimited integer precision; bitwise operations; hex, octal, and binary formats
- Third-party extensions: vectors, libraries, visualization, plotting, etc.

*Integer and floating-point literals*
- Integers are written as strings of decimal digits. Floating-point numbers contain a decimal point and/or an optional signed exponent (introduced by e or E). A decimal point or exponent immediately makes the literal a floating-point object. Floating-point numbers are implemented as C "doubles", so they get as much precision as the C compiler used to build the Python interpreter gives to doubles.

*Integers*
- Integers have unlimited precision: they can grow as large as memory allows.
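The literal rules and the precision difference described above can be tried out as follows. The specific values are arbitrary; the last two lines contrast exact integer arithmetic with the limited precision of doubles.

```python
int_literal = 1234          # digits only: an integer
float_point = 1.23          # a decimal point makes a float
float_exp = 3e2             # so does an exponent: 300.0
kinds = (type(int_literal), type(float_point), type(float_exp))
print(kinds)

exact = (10 ** 30 + 1) - 10 ** 30     # integers stay exact
rounded = (1e30 + 1) - 1e30           # a double cannot hold the extra 1
print(exact, rounded)
```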

*Hexadecimal, octal, and binary literals*
- Integers can be coded in different bases: decimal (base 10), hexadecimal (base 16), octal (base 8), or binary (base 2). The last three are common in some programming domains. Hexadecimals start with a leading **0x** or **0X**, followed by a string of hexadecimal digits (0-9 and A-F). Hex digits may be coded in lower- or uppercase. Octal l
