# 4 Introducing Python Object Types

**Intro**
Data takes the form of *objects* in Python. Objects are essentially just pieces of memory, with values and sets of associated operations.

Objects are the most fundamental notion in Python. Python ships with built-in objects.

**The Python Conceptual Hierarchy**
Python programs can be decomposed into modules, statements, expressions, and objects.
1. Programs are composed of modules.
2. Modules contain statements.
3. Statements contain expressions.
4. Expressions create and process objects.

Built-in types are a mandatory stop on every Python journey. Later we will both use and emulate built-in object types to create our own objects with OOP (object-oriented programming).

In lower-level languages such as C or C++ you need to lay out memory structures, manage memory allocation, etc. when implementing objects. In typical Python programming you don't have to worry about that. Python provides powerful object types as an intrinsic part of the language.

**Why Use Built-in Types?**
- Built-in objects make programs easier to write.
- Built-in objects are components of extensions. (For example, a stack data structure may be implemented as a class that manages or customizes a built-in list.)
- Built-in objects are often more efficient than custom data structures.
- Built-in objects are a standard part of the language. 

Built-in objects form the core of all Python programs.
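
The stack mentioned in the list above can be sketched roughly as follows. This is only an illustration: the class name `Stack` and its method names are hypothetical choices, not something defined in these notes.

```python
class Stack:
    """A hypothetical stack that manages a built-in list internally."""

    def __init__(self):
        self._items = []            # the built-in list does the real work

    def push(self, item):
        self._items.append(item)    # the top of the stack is the end of the list

    def pop(self):
        return self._items.pop()    # raises IndexError if the stack is empty

    def __len__(self):
        return len(self._items)


stack = Stack()
stack.push('spam')
stack.push('eggs')
top = stack.pop()                   # last item pushed comes off first
print(top, len(stack))
```

The class adds almost no logic of its own; growing, shrinking, and length all come from the list it wraps.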

**Python's Core Data Types**
Literals: An expression whose syntax generates an object.

Built-in objects:
- Numbers
- Strings
- Lists
- Dictionaries
- Tuples
- Files 
- Sets
- Other core types: Booleans, types, None
- Program unit types: Functions, modules, classes
- Implementation-related types: Compiled code, stack tracebacks

However, this list isn't really complete, as *everything* we process in Python is some kind of object.

Once you create an object, you bind its operation set for all time (e.g., you can only perform string operations on a string). Python is *dynamically typed*: it keeps track of types for you automatically instead of requiring declaration code.
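
A small sketch of both ideas, assuming nothing beyond the standard interpreter (the variable names are made up for illustration): a name needs no declaration and can refer to objects of different types over time, while each object only accepts the operations of its own type.

```python
x = 'spam'                  # no declaration: x now refers to a str object
type_before = type(x)
x = 42                      # the same name can later refer to an int object
type_after = type(x)

try:
    'spam' + 1              # a str only supports str operations
except TypeError as exc:
    error_message = str(exc)

print(type_before, type_after)
print(error_message)
```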

**Numbers**

Numbers are fairly straightforward. Python integers automatically provide extra precision for large values.
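
As a quick illustration of the extra precision (the variable names here are arbitrary), integers simply grow as large as needed:

```python
big = 2 ** 100
print(big)                       # 1267650600228229401496703205376

digits = len(str(2 ** 1000))     # how many decimal digits in 2 to the power 1000
print(digits)                    # 302
```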

In [3]:
3.1415*2

6.283

There are two display formats: the as-code *repr* and the more user-friendly *str*.
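
The difference is easiest to see on a type where the two formats differ. The sketch below uses `fractions.Fraction` from the standard library purely as an example; that choice is an assumption of this illustration, not something the notes rely on.

```python
from fractions import Fraction

third = Fraction(1, 3)
as_code = repr(third)       # as-code form, what an interactive echo shows
friendly = str(third)       # user-friendly form, what print() shows

print(as_code)              # Fraction(1, 3)
print(friendly)             # 1/3
```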

In [4]:
# A handful of useful numeric modules also ship with Python
import math
math.pi

3.141592653589793

In [12]:
import random
print(random.random())
print(random.choice([1,2,3,4]))

0.13773440822914507
3


**Strings**

Strings are used to record both textual information and arbitrary collections of bytes (an image, for instance).
Strings are the first example of what Python calls a *sequence*:
-> A positionally ordered collection of other objects

In [19]:
S = 'Spam'
print(len(S))
print(S[0])
print(S[1])
print(S[-1])  # the same as the one below
print(S[len(S)-1])
print(S[-2])

4
S
p
m
m
a


Anywhere that Python expects a value, we can use a literal, a variable, or any expression we wish.

In [20]:
print(S)
print(S[1:3]) # Slice from offset 1 up to, but not including, offset 3

Spam
pa


In a slice, the left bound defaults to zero, and the right bound defaults to the length of the sequence being sliced.

In [23]:
print(S[1:]) # Same as S[1:len(S)]
print(S) 
print(S[0:3])

pam
Spam
Spa
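
A few more slice forms that rely on the same defaults, shown as a small sketch with made-up variable names:

```python
S = 'Spam'

first_three = S[:3]     # left bound omitted, same as S[0:3]
all_but_last = S[:-1]   # negative right bound counts from the end
full_copy = S[:]        # both bounds omitted, same as S[0:len(S)]

print(first_three, all_but_last, full_copy)
```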


In [24]:
S + 'xyz' # Concatenation

'Spamxyz'

In [25]:
S * 8 # Repetition

'SpamSpamSpamSpamSpamSpamSpamSpam'

*Polymorphism*: The meaning of an operation depends on the objects being operated on.
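
For instance, the `+` operator means addition for numbers but concatenation for sequences. A minimal sketch (variable names are illustrative only):

```python
number_result = 1 + 2               # numeric addition
string_result = 'Spam' + 'xyz'      # string concatenation
list_result = [1, 2] + [3]          # list concatenation

print(number_result, string_result, list_result)
```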

Immutable objects: Objects that cannot be changed in place.

-> To "change" an immutable object, you can use an expression to create a new object and assign it to the same name, although this is not very efficient.

Among other things immutability can be used in Python to guarantee that an object remains constant throughout the program.
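
A short sketch of what immutability means in practice for strings: in-place assignment fails, but building a new string and rebinding the name works.

```python
S = 'Spam'

try:
    S[0] = 'z'                  # strings cannot be changed in place
except TypeError as exc:
    error_message = str(exc)

S = 'z' + S[1:]                 # build a new string, rebind the same name
print(error_message)
print(S)                        # zpam
```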

There is also a data type for text-based data that is a sort of hybrid between immutable strings and mutable lists, called the *bytearray*. It supports in-place changes to text, but only when the characters are at most 8 bits wide (e.g., ASCII).

You can also change text-based data in place if you expand it into a list, and then join it back together into a string.



In [26]:
S = 'shrubbery'
L = list(S)
L

['s', 'h', 'r', 'u', 'b', 'b', 'e', 'r', 'y']

In [28]:
L[1] = 'c'

In [29]:
''.join(L)

'scrubbery'

In [30]:
B = bytearray(b'spam')

In [31]:
B.extend(b'eggs')

In [32]:
B

bytearray(b'spameggs')

In [33]:
B.decode()

'spameggs'

You can use generic sequence operations on strings, because strings are sequences. You can however also use *type-specific methods*. Methods are functions that are attached to and act upon a specific object.

In [35]:
dir('') # dir() can be used to show a list of all available methods attached to an object

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


In [36]:
S = 'spam'
S.find('pa')

1

In [39]:
S.replace('pa','XYZ') # Returns a NEW string object in which the matched substring has been replaced

'sXYZm'

**Note:** S itself is unchanged by these calls (it is still 'spam'), because strings are immutable.

In [2]:
line = 'aaa,bbbb,cccc,dddd'
line.split(',') # Split on a delimiter into a list of substrings

['aaa', 'bbbb', 'cccc', 'dddd']

In [3]:
S = 'spam'
S.upper()

'SPAM'

In [4]:
S.isalpha()

True

In [5]:
S.isdigit()

False

In [6]:
line = 'aaaa,bbb,cccc,dd\n\n   '

In [8]:
line.rstrip() # Strip whitespace characters off the end of the string

'aaaa,bbb,cccc,dd'

In [10]:
line.rstrip().split() # Strips first, then splits on whitespace (no delimiter given), so the commas remain

['aaaa,bbb,cccc,dd']

In [14]:
'{} and {} and {}'.format('ball','apple','sword') # Newer formatting method; position numbers can be omitted

'ball and apple and sword'

In [13]:
'%s and %s and %s' % ('ball','football','snow') # Old formatting expression

'ball and football and snow'

In [16]:
'{:,.2f}'.format(294329234.235345435) # Separators, decimal digits

'294,329,234.24'

In [19]:
'%.2f | %+05d' % (3.14159, -42)

'3.14 | -0042'

**Note:** Python's toolset is layered: operations that span multiple types show up as built-in functions or expressions, but *type-specific* operations are method calls.
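
A rough illustration of the two layers: the built-in function `len` works across several types, while `upper` exists only as a string method. The variable names are arbitrary.

```python
# Generic operation: one built-in function, many types
lengths = [len('spam'), len([1, 2, 3]), len({'a': 1})]

# Type-specific operation: a method that only strings have
shout = 'spam'.upper()
list_has_upper = hasattr([1, 2, 3], 'upper')

print(lengths, shout, list_has_upper)
```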

**Note:** Names with leading and trailing double underscores follow the naming pattern Python uses for implementation details.

In [25]:
help(S.replace) # asks what the replace method of a string does

Help on built-in function replace:

replace(old, new, count=-1, /) method of builtins.str instance
    Return a copy with all occurrences of substring old replaced by new.
    
      count
        Maximum number of occurrences to replace.
        -1 (the default value) means replace all occurrences.
    
    If the optional argument count is given, only the first count occurrences are
    replaced.



**Note:** You can also pass type names (e.g., str, list, and dict) to the dir and help functions, not just real objects.

In [26]:
S = 'A\nB\tC' #\n is end of line, \t is tab
len(S) # Each stands for just one character

5

In [27]:
ord('\n') # \n is a byte with the binary value 10 in ASCII

10

In [28]:
S = 'A\0b\0C' # \0, a binary zero byte, does not terminate string

In [29]:
len(S)

5

In [31]:
S # Non-printables are displayed as \xNN hex escapes

'A\x00b\x00C'

**Note:** Python allows strings to be enclosed in *single* or *double* quotes. They mean the same thing, but each allows the other type of quote to be embedded without an escape (most programmers prefer to use single quotes). It also allows multiline string literals enclosed in *triple* quotes. When this form is used, all the lines are concatenated together, and end-of-line characters are added where line breaks appear.
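
A brief sketch of the quoting options (the sample text is made up): either quote style can hold the other kind of quote directly, and a backslash escape is only needed when the embedded quote matches the enclosing one.

```python
double_quoted = "knight's"      # single quote embedded, no escape needed
single_quoted = 'say "Ni"'      # double quotes embedded, no escape needed
escaped = 'knight\'s'           # same quote type inside needs a backslash

print(double_quoted, single_quoted, escaped)
```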

In [32]:
msg = """aklsjdasssssssssssssssssss
dsfssssssssssssss
dsfsfffffsdf"""

In [33]:
print(msg)

aklsjdasssssssssssssssssss
dsfssssssssssssss
dsfsfffffsdf


**Note:** Python also supports a *raw* string literal that turns off the backslash escape mechanism (useful for Windows paths, for example). Just add an r right before the opening quote.
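
A small sketch of why this matters for Windows-style paths. The path below is a made-up example; note how `\n` and `\t` are silently turned into a newline and a tab in the plain literal.

```python
plain = 'C:\new\text.dat'       # \n and \t are interpreted as escapes
raw = r'C:\new\text.dat'        # backslashes are kept literally

print(len(plain), len(raw))     # 13 15
print(raw)
```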

The escape *\u* can be used to embed actual Unicode code-point ordinal-value integers in text strings. Python 3.X does not allow normal and byte strings to mix without explicit conversion (using the .encode and .decode methods).
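
A minimal sketch of these rules, assuming UTF-8 as the encoding (the notes do not name one) and using illustrative variable names:

```python
text = 'sp\u00c4m'                      # \u00c4 is the code point for 'Ä'
encoded = text.encode('utf-8')          # str -> bytes
decoded = encoded.decode('utf-8')       # bytes -> str

try:
    text + encoded                      # mixing str and bytes is an error
except TypeError:
    mixing_allowed = False
else:
    mixing_allowed = True

print(len(text), len(encoded), mixing_allowed)
```

The text has four characters, but its UTF-8 form takes five bytes because the non-ASCII character needs two.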

**Note**: Unicode processing mostly reduces to transferring text data to and from files - text is encoded to bytes when stored in a file, and decoded into characters (a.k.a. code points) when read back into memory. Once it is loaded, we usually process text as strings in decoded form only.

Files are also content-specific in 3.X: text files accept and return str strings, but binary files instead deal in bytes strings for raw binary data.
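
The following sketch writes a short text file and reads it back in both modes. The file name, the temporary directory, and the UTF-8 encoding are all assumptions made for the sake of a runnable example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')     # hypothetical file name

    with open(path, 'w', encoding='utf-8') as f:
        f.write('sp\u00c4m')                    # text mode accepts str

    with open(path, 'r', encoding='utf-8') as f:
        as_text = f.read()                      # text mode returns str

    with open(path, 'rb') as f:
        as_bytes = f.read()                     # binary mode returns bytes

print(type(as_text), type(as_bytes))
```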

## Pattern Matching
Text pattern matching can be done using the built-in module *re*. This module has analogous calls for searching, splitting, and replacement, but because we can use patterns to specify substrings, we can be much more general:

In [1]:
import re
match = re.match('Hello[ \t]*(.*)world', 'Hello     Python world') # [ \t]* skips any spaces or tabs

In [2]:
match.group(1)

'Python '

In [3]:
match = re.match('[/:](.*)[/:](.*)[/:](.*)', '/usr/home:lumberjack')

In [4]:
match.groups()

('usr', 'home', 'lumberjack')

In [7]:
match.group(3)

'lumberjack'

In [8]:
re.split('[/:]', '/usr/home/lumberjack')

['', 'usr', 'home', 'lumberjack']

# Lists

**Definition**: Lists are positionally ordered collections of arbitrarily typed objects, with no fixed size. They are also *mutable*, in other words susceptible to in-place changes.

Supports sequence operations, e.g.

In [33]:
L = [123, 'spam', 1.23] # A list of three different-type objects
len(L) # Number of items in list

3

In [12]:
L[0] # Indexing by position

123

In [13]:
L[:-1] # Slicing a list returns a new list

[123, 'spam']

In [15]:
L + [4,5,6] # Concat/repeat make new lists too

[123, 'spam', 1.23, 4, 5, 6]

In [16]:
L * 2

[123, 'spam', 1.23, 123, 'spam', 1.23]

## Type-Specific Operations

In [34]:
L.append('NI') # Growing: add object at end of list
L

[123, 'spam', 1.23, 'NI']

In [21]:
L.pop(1) # Shrinking: delete an item in list by position, and return that object

'spam'

In [19]:
L

[123, 1.23, 'NI']

In [30]:
del L[1] # Another way to delete from list

In [35]:
L

[123, 'NI']

Other type-specific methods include *insert*, to insert an item at an arbitrary position, *remove*, to remove a given item by value, *extend*, to add multiple items at the end, and more.
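
A quick sketch of those three methods in action, starting from a fresh list so the result does not depend on the cells above:

```python
L = [123, 'spam', 1.23]

L.insert(1, 'eggs')     # insert at position 1  -> [123, 'eggs', 'spam', 1.23]
L.remove('spam')        # remove by value       -> [123, 'eggs', 1.23]
L.extend([4, 5])        # add several at end    -> [123, 'eggs', 1.23, 4, 5]

print(L)
```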

**Note**: Since lists are mutable, most list methods tend to change the object in place, instead of creating a new one.

**Bounds Checking**: To reference or assign items off the end of a list is always an error. To grow a list, use *.append()*.
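
A small sketch of what happens when you go off the end of a list, both when reading and when assigning (the `errors` list is just a helper for this illustration):

```python
L = [123, 'spam', 1.23]
errors = []

try:
    L[99]                       # reference off the end
except IndexError as exc:
    errors.append(str(exc))

try:
    L[99] = 1                   # assignment off the end
except IndexError as exc:
    errors.append(str(exc))

L.append('NI')                  # the supported way to grow a list
print(errors)
print(L)
```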

**Nesting**: Python's core data types support arbitrary *nesting*. We can nest in any combination, as deeply as we like.
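
For example, a list of lists can serve as a simple matrix. The sketch below is only an illustration of indexing into nested lists:

```python
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]         # a list that contains three other lists

row = M[1]              # the second row
item = M[1][2]          # second row, then third item within it

print(row, item)        # [4, 5, 6] 6
```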

## Comprehensions

In [1]:
M = [[1,2,3],
    [4,5,6],
    [7,8,9]] # a 3x3 matrix

In [4]:
[row[1] for row in M] # Second column of M in a list

[2, 5, 8]

In [5]:
M

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

**Note:** List comprehensions can be more complex.

In [7]:
[row[1] + 1 for row in M] # Add 1 to each item in column 2

[3, 6, 9]

In [9]:
[row[1] for row in M if row[1] % 2 == 0] # Filter out odd numbers

[2, 8]

In [10]:
diag = [M[i][i] for i in [0, 1, 2]]

In [11]:
diag

[1, 5, 9]

**Note:** List comprehensions can be used to iterate over any *iterable* object.

In [13]:
doubles = [c * 2 for c in 'spam'] # repeat characters of string

In [14]:
doubles

['ss', 'pp', 'aa', 'mm']

In [15]:
type(range(4))

range

In [16]:
range(4)

range(0, 4)

In [17]:
print(range(4))

range(0, 4)


In [18]:
list(range(4))

[0, 1, 2, 3]

In [19]:
list(range(-10,11,2))

[-10, -8, -6, -4, -2, 0, 2, 4, 6, 8, 10]

**Note:** Comprehensions are not limited to lists. The same syntax can also be used to create, for example, generators (iterable objects that produce results on demand).

In [7]:
G = (sum(row) for row in M) # create a generator of row sums

In [3]:
next(G)

6

In [4]:
next(G)

15

In [5]:
next(G)

24

In [6]:
next(G)

StopIteration: 

In [9]:
list(map(sum,M)) # Map sum over items in M

[6, 15, 24]

In [12]:
{sum(row) for row in M} # set comprehension

{6, 15, 24}

In [13]:
{i:sum(M[i]) for i in range(3)} # dictionary comprehension

{0: 6, 1: 15, 2: 24}

# Dictionaries

**Definition**: Dictionaries are *mappings*: collections that store objects by key rather than by relative position (as lists do). Dictionaries are also mutable.

In [15]:
D = {'food': 'Spam', 'quantity': 4, 'color': 'pink'}

In [17]:
D['food'] # Fetch value by key

'Spam'

In [21]:
D['quantity'] += 1 # Add 1 to 'quantity' value

In [22]:
D

{'food': 'Spam', 'quantity': 5, 'color': 'pink'}

**Note:** We most often see dictionaries built up in other ways than the curly-braces literal form.

In [23]:
D = {}

In [27]:
D['name'] = 'Bob' # Create keys by assignment

In [25]:
D['job'] = 'dev'

In [26]:
D['age'] = 40

In [28]:
D

{'name': 'Bob', 'job': 'dev', 'age': 40}

In [29]:
print(D['name'])

Bob


In [30]:
bob1 = dict(name='Bob', job='dev', age=40)

In [31]:
bob1

{'name': 'Bob', 'job': 'dev', 'age': 40}

In [34]:
bob2 = dict(zip(['name', 'job', 'age'],['Bob','dev',40]))

In [35]:
bob2

{'name': 'Bob', 'job': 'dev', 'age': 40}

**Note:** Dictionaries can, as we saw with lists, be arbitrarily nested.

In [36]:
rec =  {'name': {'first': 'Bob', 'last': 'Smith'},
       'jobs': ['dev', 'mgr'],
       'age': 40.5}

In [37]:
rec

{'name': {'first': 'Bob', 'last': 'Smith'},
 'jobs': ['dev', 'mgr'],
 'age': 40.5}

In [38]:
rec['name']

{'first': 'Bob', 'last': 'Smith'}

In [39]:
rec['name']['last']

'Smith'

In [40]:
rec['jobs']

['dev', 'mgr']

In [41]:
rec['jobs'][-1]

'mgr'

In [42]:
rec['jobs'].append('janitor')

In [43]:
rec['jobs']

['dev', 'mgr', 'janitor']

**Note:** As you can see, nesting allows us to build up complex information structures directly and easily.