# Introducing Python Object Types (Data Structures)

A Python program can be decomposed into modules, statements, expressions, and objects, as follows:

1. Programs are composed of modules.
2. Modules contain statements.
3. Statements contain expressions.
4. Expressions create and process objects.
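
The hierarchy above can be sketched in a few lines. This is a minimal illustration only; the names `radius` and `area` are made up for the example and do not appear elsewhere in these notes.

```python
# A module: a file of tools, made available by an import statement
import math

# A statement: an assignment that binds the name radius to an integer object
radius = 2

# An expression nested in a statement: it creates and processes number objects
area = math.pi * radius ** 2

# Every value involved is an object with a type
print(type(radius), type(area))
```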

## Numbers

In [1]:
# Integer Addition
123 + 222

345

In [2]:
# Floating-point multiplication
1.5 * 4

6.0

In [3]:
# Power operation
2 ** 100

1267650600228229401496703205376

Besides expressions, a handful of useful numeric modules ship with Python. Modules are just packages of additional tools that we import to use:

In [4]:
import math
math.pi

3.141592653589793

In [5]:
math.sqrt(81)

9.0

## Strings

In [6]:
# Make a 4-character string, and assign it to a name
S = 'Spam'

In [7]:
# Length
len(S)

4

In [8]:
# The first item in S, indexing by zero-based position
S[0]

'S'

In [9]:
# The second item from the left
S[1]

'p'

In [10]:
# The last item from the end in S
S[-1]

'm'

In [11]:
# The second to last item from the end
S[-2]

'a'

In [12]:
# Slice of S from offsets 1 through 2 (not 3)
S[1:3]

'pa'

In [13]:
# Everything past the first
S[1:]

'pam'

In [14]:
# Everything but the last
S[:3]

'Spa'

In [15]:
# All of S as a top-level copy
S[:]

'Spam'

In [16]:
# Concatenation
S + 'xyz'

'Spamxyz'

In [17]:
# Repetition
S * 8

'SpamSpamSpamSpamSpamSpamSpamSpam'

**Polymorphism:** Notice that the plus sign ( + ) means different things for different objects: addition for numbers, and concatenation for strings. This is a general property of Python that we’ll call polymorphism: the meaning of an operation depends on the objects being operated on.
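
A small sketch of the same idea, applying `+` and `*` to different object types. The variable names are illustrative, and the list case assumes the list type introduced later in these notes.

```python
# The + operator: addition for numbers, concatenation for sequences
num_sum = 1 + 2                # 3
str_sum = 'Spam' + 'xyz'       # 'Spamxyz'
list_sum = [1, 2] + [3]        # [1, 2, 3]

# The * operator: multiplication for numbers, repetition for sequences
num_prod = 3 * 2               # 6
str_prod = 'Ni' * 2            # 'NiNi'

# Mixing types that do not define the operation together raises an error
try:
    'Spam' + 1
except TypeError as exc:
    mixed_error = str(exc)
    print('TypeError:', mixed_error)
```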

**Immutability:** Strings are immutable in Python -- they cannot be changed in place after they are created. For example, you can’t change a string by assigning to one of its positions, but you can always build a new one and assign it to the same name. Immutability can be used to guarantee that an object remains constant throughout your program.

In [18]:
# Immutable objects cannot be changed
S[0] = 'z'

TypeError: 'str' object does not support item assignment

In [19]:
# But we can run expressions to make new objects
S = 'z' + S[1:]
S

'zpam'

Every object in Python is classified as either immutable (unchangeable) or not. In terms of the core types, *numbers*, *strings*, and *tuples* are immutable; *lists*, *dictionaries*, and *sets* are not: they can be changed in place freely, as can most new objects you’ll code with classes.
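
As a hedged illustration of this classification, the sketch below tries the same item assignment on a tuple and on a list. The names `point`, `items`, and `alias` are hypothetical.

```python
point = (1, 2, 3)    # tuple: immutable
items = [1, 2, 3]    # list: mutable
alias = items        # a second name for the same list object

# Item assignment fails for the immutable tuple
try:
    point[0] = 99
except TypeError as exc:
    tuple_error = str(exc)
    print('TypeError:', tuple_error)

# The same assignment changes the list in place
items[0] = 99
print(items)         # the list itself was modified
print(alias)         # the change is visible through every name for the object
```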

In addition to generic sequence operations, strings also have operations all their own, available as *methods*: functions that are attached to and act upon a specific object.

In [20]:
S = 'Spam'

# Find the offset of a substring in S
S.find('pa')

1

In [21]:
S.find?

In [22]:
S

'Spam'

In [23]:
# Replace occurrences of a string in S with another
S.replace('pa', 'XYZ')

'SXYZm'

In [24]:
# The original string is unchanged
S

'Spam'

**Other methods:** Other string methods split a string on a delimiter, perform case conversions, test the content of the string, and strip whitespace characters off its ends.

In [25]:
line = 'aaa,bbb,cccc,dd'

# split on a delimiter into a list of substrings
line.split(',')

['aaa', 'bbb', 'cccc', 'dd']

In [26]:
S = 'spam'

# Upper- and lowercase conversions
S.upper()

'SPAM'

In [27]:
S

'spam'

In [28]:
# Content tests: isalpha, isdigit, etc.
S.isalpha()

True

In [29]:
line = 'aaa, bbb, cccc, dd\n'

# Remove whitespace characters on the right side
line.rstrip()

'aaa, bbb, cccc, dd'

In [30]:
# Combine two operations
line.rstrip().split(',')

['aaa', ' bbb', ' cccc', ' dd']
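
Note that the pieces in the last result keep their leading spaces, because the split is on the comma alone. Two possible ways to get clean fields are sketched below; the names `fields1` and `fields2` are illustrative, and the second option uses a list comprehension, which is covered later in these notes.

```python
line = 'aaa, bbb, cccc, dd\n'

# Option 1: split on the full delimiter (comma plus space)
fields1 = line.rstrip().split(', ')

# Option 2: split on commas, then strip whitespace from each piece
fields2 = [field.strip() for field in line.split(',')]

print(fields1)
print(fields2)
```

Option 1 is shorter but assumes exactly one space after every comma; option 2 tolerates irregular spacing.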

**Getting Help:** The built-in **dir** function returns a list of all the attributes available for any object passed to it, and **help** displays the documentation for a given attribute.

In [31]:
dir(S)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


In [32]:
help(S.replace)

Help on built-in function replace:

replace(old, new, count=-1, /) method of builtins.str instance
    Return a copy with all occurrences of substring old replaced by new.
    
      count
        Maximum number of occurrences to replace.
        -1 (the default value) means replace all occurrences.
    
    If the optional argument count is given, only the first count occurrences are
    replaced.



## Lists

The Python list object is the most general sequence provided by the language. Lists are positionally ordered collections of arbitrarily typed objects, and they have no fixed size. They are also mutable—unlike strings.

In [33]:
# A list of three different-type objects
L = [123, 'spam', 1.23]

In [34]:
# Number of items in the list
len(L)

3

We can index, slice, and so on, just as for strings:

In [35]:
# Indexing by position
L[0]

123

In [36]:
# Slicing a list returns a new list
L[:-1]

[123, 'spam']

In [37]:
# Concat/repeat make new lists too
L + [4, 5, 6]

[123, 'spam', 1.23, 4, 5, 6]

In [38]:
L * 2

[123, 'spam', 1.23, 123, 'spam', 1.23]

In [39]:
# We're not changing the original list
L

[123, 'spam', 1.23]

Further, lists have no fixed *size*. That is, they can grow and shrink on demand, in response to list-specific operations.

In [40]:
# Growing: add object at end of list
L.append('NI')
L

[123, 'spam', 1.23, 'NI']

In [41]:
# Shrinking: delete an item in the middle
L.pop(2)

1.23

In [42]:
# del L[2] deletes from a list too
L

[123, 'spam', 'NI']

Because lists are mutable, most list methods also change the list object in place, instead of creating a new one:

In [43]:
M = ['bb', 'aa', 'cc']
M.sort()
M

['aa', 'bb', 'cc']

In [46]:
M.sort?

In [45]:
M.sort(reverse=True)
M

['cc', 'bb', 'aa']

In [47]:
M.reverse()
M

['aa', 'bb', 'cc']

In [49]:
help(M.reverse)

Help on built-in function reverse:

reverse() method of builtins.list instance
    Reverse *IN PLACE*.
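
One consequence worth sketching: in-place methods such as `sort` return `None`, whereas the built-ins `sorted` and `reversed` leave the original list alone and produce new results. The names below are illustrative.

```python
names = ['bb', 'aa', 'cc']

# sort changes the list in place and returns None
result = names.sort()
print(result, names)

# sorted and reversed build new objects and leave the original unchanged
original = ['bb', 'aa', 'cc']
ordered = sorted(original)
backwards = list(reversed(original))
print(original, ordered, backwards)
```

A common mistake is writing `names = names.sort()`, which rebinds `names` to `None`.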



**Nesting:** We can nest Python's core data types in any combination, and as deeply as we like. One immediate application of this feature is to represent matrices, or "multidimensional arrays", in Python.

In [51]:
# A 3 x 3 matrix, as nested lists; code can span lines if bracketed
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

In [52]:
M

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [53]:
# Get row 2
M[1]

[4, 5, 6]

In [54]:
# Get row 2, then get item 3 within the row
M[1][2]

6

**Comprehensions:** In addition to sequence operations and list methods, Python includes a more advanced operation known as a list comprehension expression, which turns out to be a powerful way to process structures like our matrix.

In [55]:
# Collect the items in column 2
col2 = [row[1] for row in M]

col2

[2, 5, 8]

In [56]:
# The matrix is unchanged
M

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [57]:
# Add 1 to each item in column 2
[row[1] + 1 for row in M]

[3, 6, 9]

In [58]:
# Filter out odd items
[row[1] for row in M if row[1] % 2 == 0]

[2, 8]
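
The filtered comprehension above can be read as shorthand for an explicit loop. A minimal sketch of the equivalent code, with `evens` as an illustrative name:

```python
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# Loop form: visit each row, test the column-2 item, collect the matches
evens = []
for row in M:
    if row[1] % 2 == 0:
        evens.append(row[1])

# Comprehension form: the same logic in a single expression
evens_comp = [row[1] for row in M if row[1] % 2 == 0]

print(evens, evens_comp)
```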

In [59]:
# Collect a diagonal from matrix
diag = [M[i][i] for i in [0, 1, 2]]

diag

[1, 5, 9]

In [60]:
# Repeat characters in a string
doubles = [c * 2 for c in 'spam']

doubles

['ss', 'pp', 'aa', 'mm']

In [62]:
[c * 2 for c in str(345)]

['33', '44', '55']

In [63]:
[int(c) * 2 for c in str(345)]

[6, 8, 10]

In [64]:
[int(c) * 2 for c in str(345) + 'a']

ValueError: invalid literal for int() with base 10: 'a'

In [67]:
[int(c) * 2 for c in str(345) + 'a' if c.isdigit()]

[6, 8, 10]

In [70]:
[c for c in (str(345) + 'a').reverse()]

AttributeError: 'str' object has no attribute 'reverse'
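
Strings have no `reverse` method because they are immutable and cannot be changed in place. A few ways to get the reversed characters as a new object are sketched here; the variable names are illustrative.

```python
text = str(345) + 'a'                      # '345a'

# A slice with a step of -1 builds a new, reversed string
rev_slice = text[::-1]

# reversed() works on any sequence; join glues the characters back together
rev_join = ''.join(reversed(text))

# reversed() can also feed a comprehension directly
rev_chars = [c for c in reversed(text)]

print(rev_slice, rev_join, rev_chars)
```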

In [75]:
my_list = list(str(345) + 'a')  # strings have no reverse(), but lists do
my_list.reverse()

In [76]:
my_list

['a', '5', '4', '3']

In [77]:
my_list = ["3", "4", "5", "a"]
my_list.reverse()
[c for c in my_list]

['a', '5', '4', '3']

In [66]:
c = "4"
c.isdigit()

True

In [78]:
range(4)

range(0, 4)

As the previous cell shows, **range** is a built-in that generates successive integers on demand; in Python 3.X it must be wrapped in a **list** call to display all its values at once.

In [79]:
# Generate values from 0 to 3
list(range(4))

[0, 1, 2, 3]

In [80]:
# Generate values from -6 to 6 by 2
list(range(-6, 7, 2))

[-6, -4, -2, 0, 2, 4, 6]
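
A range does not need to be converted to a list before it is used. As a hedged sketch, it already supports length, indexing, membership tests, and iteration; the names `r` and `total` are illustrative.

```python
r = range(-6, 7, 2)

size = len(r)          # number of values the range will produce
second = r[1]          # indexing works without building a list
last = r[-1]
has_four = 4 in r      # membership test
has_five = 5 in r

# Loops consume a range directly, one value at a time
total = 0
for x in range(4):
    total += x

print(size, second, last, has_four, has_five, total)
```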

In [81]:
# Multiple values
[[x ** 2, x**3] for x in range(4)]

[[0, 0], [1, 1], [4, 8], [9, 27]]

In [None]:
# Multiple values with "if" filters
[[x, x/2, x * 2] for x in range(-6, 7, 2) if x > 0]

[[2, 1.0, 4], [4, 2.0, 8], [6, 3.0, 12]]

## Dictionaries

Python dictionaries are not sequences at all, but are instead known as mappings. They simply map keys to associated values. Dictionaries, the only mapping type in Python’s core objects set, are also mutable: like lists, they may be changed in place and can grow and shrink on demand.

In [83]:
D = {'food': 'Spam', 'quantity': 4, 'color': 'pink'}

In [84]:
# Fetch value of key 'food'
D['food']

'Spam'

In [85]:
# Add 1 to 'quantity' value twice: the long form, then the += shorthand
D['quantity'] = D['quantity'] + 1

D['quantity'] += 1

D

{'food': 'Spam', 'quantity': 6, 'color': 'pink'}

You can start with an empty dictionary and fill it out one key at a time.

In [86]:
D = {}

# Create keys by assignment
D['name'] = 'Bob'
D['job'] = 'dev'
D['age'] = 40

D

{'name': 'Bob', 'job': 'dev', 'age': 40}

In [87]:
print(D['name'])

Bob


In other applications, dictionaries can also be used to replace searching operations—indexing a dictionary by key is often the fastest way to code a search in Python.
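
To make the search comparison concrete, here is a minimal sketch that looks up the same value by scanning a list of pairs and by indexing a dictionary. The data and the helper `find_age_linear` are hypothetical.

```python
# Hypothetical records stored as (name, age) pairs
pairs = [('bob', 40), ('sue', 35), ('ann', 28)]

def find_age_linear(name):
    """Search by scanning the pairs one at a time."""
    for key, age in pairs:
        if key == name:
            return age
    return None

# The same data as a dictionary: the search becomes a single index operation
ages = dict(pairs)

print(find_age_linear('sue'))
print(ages['sue'])
```

The scan may have to visit every pair, while the dictionary index goes straight to the key.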

We can also make dictionaries by passing to the dict type name either keyword arguments (a special name=value syntax in function calls), or the result of zipping together sequences of keys and values obtained at runtime (e.g., from files).

In [88]:
# Keywords
bob1 = dict(name='Bob', job='dev', age=40)

bob1

{'name': 'Bob', 'job': 'dev', 'age': 40}

In [89]:
# Zipping
bob2 = dict(zip(['name', 'job', 'age'], ['Bob', 'dev', 40]))

bob2

{'name': 'Bob', 'job': 'dev', 'age': 40}

In [90]:
[[a, b] for a, b in zip(['name', 'job', 'age'], ['Bob', 'dev', 40])]

[['name', 'Bob'], ['job', 'dev'], ['age', 40]]

**Nesting Revisited**: The following dictionary, coded all at once as a literal, captures more structured information.

In [92]:
rec = {'name': {'first': 'Bob', 'last': 'Smith'},
       'jobs': ['dev', 'mgr'],
       'age': 40.5}

In [93]:
# 'name' is a nested dictionary
rec['name']

{'first': 'Bob', 'last': 'Smith'}

In [94]:
# Index the nested dictionary
rec['name']['last']

'Smith'

In [95]:
# 'jobs' is a nested list
rec['jobs']

['dev', 'mgr']

In [96]:
# Index the nested list
rec['jobs'][-1]

'mgr'

In [97]:
# Expand Bob's job description in place
rec['jobs'].append('janitor')

rec

{'name': {'first': 'Bob', 'last': 'Smith'},
 'jobs': ['dev', 'mgr', 'janitor'],
 'age': 40.5}

The real reason for showing you this example is to demonstrate the flexibility of Python’s core data types. As you can see, nesting allows us to build up complex information structures directly and easily. Building a similar structure in a low-level language like C would be tedious and require much more code: we would have to lay out and declare structures and arrays, fill out values, link everything together, and so on.

**Garbage Collection**: Just as importantly, in a lower-level language we would have to be careful to clean up all of the object’s space when we no longer need it. In Python, when we lose the last reference to the object (by assigning its variable to something else, for example), all of the memory space occupied by that object’s structure is automatically cleaned up for us.

In [98]:
# Now the object's space is reclaimed
rec = 0
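
Reclamation is normally invisible, but it can be observed with a weak reference from the standard `weakref` module. The sketch below is an illustration under assumptions: `Record` is a hypothetical stand-in class (plain dictionaries cannot be weakly referenced), and the explicit `gc.collect()` call is only there for implementations that do not reclaim objects immediately.

```python
import gc
import weakref

class Record:
    """Hypothetical stand-in for a structure like rec."""

rec = Record()
probe = weakref.ref(rec)      # a weak reference does not keep the object alive
alive_before = probe() is rec

rec = 0                       # drop the last ordinary reference
gc.collect()                  # usually unnecessary in CPython

alive_after = probe() is not None
print(alive_before, alive_after)
```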

**Missing Keys:** Although we can assign to a new key to expand a dictionary, fetching a nonexistent key is still a mistake: it raises a KeyError.

In [99]:
D = {'a': 1, 'b': 2, 'c': 3}

D

{'a': 1, 'b': 2, 'c': 3}

In [100]:
# Assigning new keys grows dictionaries
D['e'] = 99

D

{'a': 1, 'b': 2, 'c': 3, 'e': 99}

In [101]:
# Referencing a nonexistent key is an error
D['f']

KeyError: 'f'

In [102]:
'f' in D

False

In [103]:
'e' in D

True

Besides the **in** membership test, there are other ways to avoid accessing nonexistent keys in the dictionaries we create, such as the **get** method, a conditional index with a default.

In [104]:
# Index but with a default
value = D.get('x', 0)

value

0

In [105]:
help(D.get)

Help on built-in function get:

get(key, default=None, /) method of builtins.dict instance
    Return the value for key if key is in the dictionary, else default.



In [106]:
D.get("x")
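
The last call displays nothing because **get** returns `None` when the key is missing and no default is given. A hedged sketch of the main options side by side, with illustrative variable names:

```python
D = {'a': 1, 'b': 2, 'c': 3, 'e': 99}

# 1. Membership test combined with an if/else expression
value1 = D['x'] if 'x' in D else 0

# 2. Try the index and recover from the error
try:
    value2 = D['x']
except KeyError:
    value2 = 0

# 3. get with an explicit default, and with the implicit default of None
value3 = D.get('x', 0)
value4 = D.get('x')

print(value1, value2, value3, value4)
```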

We can grab a list of keys with the dictionary **keys** method.

In [107]:
# Keys list, in insertion order (Python 3.7+)
Ks = list(D.keys())

Ks

['a', 'b', 'c', 'e']

In [108]:
# Keys list sorted in place, in reverse (descending) order
Ks.sort(reverse=True)

Ks

['e', 'c', 'b', 'a']

In [109]:
# Iterate through the reverse-sorted keys
for key in Ks:
    print(key, "=>", D[key])

e => 99
c => 3
b => 2
a => 1


In [110]:
[[key, D[key]] for key in Ks]

[['e', 99], ['c', 3], ['b', 2], ['a', 1]]

The built-in **sorted** call returns a new sorted list and works on a variety of object types; here it sorts the dictionary's keys automatically.

In [111]:
for key in sorted(D):
    print(key, '=>', D[key])

a => 1
b => 2
c => 3
e => 99
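
A short sketch of **sorted** applied to a few other object types. The result is always a new list and the original object is left unchanged; the names below are illustrative.

```python
D = {'a': 1, 'b': 2, 'c': 3, 'e': 99}

keys_desc = sorted(D, reverse=True)   # dictionary keys, descending
chars = sorted('spam')                # characters of a string
nums = sorted((3, 1, 2))              # items of a tuple

print(keys_desc, chars, nums)
print(list(D))                        # the dictionary's own key order is untouched
```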
