Ricardo Duarte, [Python for Developers](http://ricardoduarte.github.io/python-for-developers/#content "Python for Developers on GitHub")
==========================

**Adapted by [Arthur Goldberg](https://www.mountsinai.org/profiles/arthur-p-goldberg) for the [Biomedical Software Engineering](https://learn.mssm.edu/webapps/blackboard/content/listContentEditable.jsp?content_id=_448512_1&course_id=_5776_1 "Biomedical Software Engineering Blackboard site") course at the Mount Sinai School of Medicine**

Chapter 5: Types
=============================

A Python *type* defines the characteristics of an *object* (more about objects later) in a Python program. Python has built-in types and custom, programmer-defined types. *Every* object has a specific type. [Built-in types](https://docs.python.org/3/library/stdtypes.html) include integers, strings, lists, sets, functions, and `type` itself. We will learn how to define custom types when we study Python objects.
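
As a minimal sketch of the claim that every object has a type, the built-in `type` function reports the type of any object. The sample values are chosen only for illustration.

```python
# type() returns the type of any object; __name__ gives the type's name
samples = [3, 3.14, 'hello', [1, 2], {1, 2}, int]
type_names = [type(obj).__name__ for obj in samples]
print(type_names)

# types are objects too, and their type is `type`
print(type(int) is type)
```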

Python programs create objects constantly. E.g., this statement creates a string object which contains `hello, world`, and stores a reference to it in the variable `hi`. Using `hi` accesses the string.

In [1]:
hi = 'hello, world'
print('hi.upper():', hi.upper())

hi.upper(): HELLO, WORLD


(If you're wondering what happens to all of the objects that get created - don't they fill up memory? - be reassured that objects which cannot be used anymore are deleted, in a process called *garbage collection*.)

Variable names must start with a letter or underscore (`_`), followed by any number of letters, digits, or underscores. They may not contain any other characters, and they may not be reserved keywords such as `if` or `class`. Uppercase and lowercase letters are considered different.
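
The naming rules can be checked from Python itself. This is a small sketch using `str.isidentifier` and the standard `keyword` module; the candidate names are hypothetical, chosen to exercise each rule.

```python
import keyword

# hypothetical candidate names, chosen to exercise each rule
candidates = ['hi', '_tmp2', '2fast', 'my-var', 'class']
validity = {}
for name in candidates:
    # isidentifier() checks the character rules; keywords must be excluded separately
    validity[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(name, validity[name])

# case matters: hi and Hi are different variables
hi = 1
Hi = 2
print(hi != Hi)
```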

Python has many built-in, standard types, including:

+ Numeric (integer, float, complex, ... )
+ Strings

Numerics
--------

In [2]:
# integers
i = 3
j = int(3)  # make an integer from 3; this is redundant
k = int('3') # convert string '3' to an int
# what does this do?
# l = int('hi')

# floats
x = 3.14
y = 22/7  # in Python 3, / always produces a float, even when both operands are integers
print('x, y:', x, y)

# scientific notation is supported
print('NA ~= ', 6.02E23)
# unless a value can be exactly represented in binary, a float is approximate
print('0.1 + 0.2 =', 0.1 + 0.2)

x, y: 3.14 3.142857142857143
NA ~=  6.02e+23
0.1 + 0.2 = 0.30000000000000004
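
Because floats are approximate, exact equality tests on them can surprise you. The sketch below shows one common workaround, `math.isclose`, and contrasts `/` with floor division `//`; it assumes Python 3 semantics.

```python
import math

total = 0.1 + 0.2
print('total == 0.3:', total == 0.3)                          # exact comparison fails
print('math.isclose(total, 0.3):', math.isclose(total, 0.3))  # tolerant comparison

# / always yields a float; // is floor division
quotient = 6 / 3
floor_quotient = 22 // 7
print(type(quotient).__name__, quotient)
print(type(floor_quotient).__name__, floor_quotient)
```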


Data structures
-----------
Python provides a wonderful, extensive set of powerful, high-performance, built-in data structures. These include:

+ List
+ Tuple
+ Set
+ String
+ Dictionary

These types support many operations. Some operations, such as [truth testing](https://docs.python.org/3/library/stdtypes.html#truth-value-testing), are supported by all objects, including these; others, such as 'contains', are supported by only some types.

Lists and tuples
-----

In [3]:
# lists and tuples
# these are both sequences that support common sequence operations, but lists can be changed whereas tuples cannot
list_1 = [1, 2, 3, 10]
tuple_1 = (1, 2, 3, 10)
print('1 in list_1:', 1 in list_1)
print('1 in tuple_1:', 1 in tuple_1)
print('list_1[2]:', list_1[2])
print('tuple_1[2]:', tuple_1[2])

list_1.append(3)
# this fails:
# tuple_1.append(3)
# try it; how can we handle the failure so that the rest of this cell executes?

# -1 indicates the last index; negative indices count from the end of the list
print('\nlist_1[-1]:', list_1[-1])
print('tuple_1[-1]:', tuple_1[-1])

# Lists can be converted into tuples:
print('\ntuple(list_1):', tuple(list_1))

# And tuples can be converted into lists:
print('list(tuple_1):', list(tuple_1))

1 in list_1: True
1 in tuple_1: True
list_1[2]: 3
tuple_1[2]: 3

list_1[-1]: 3
tuple_1[-1]: 10

tuple(list_1): (1, 2, 3, 10, 3)
list(tuple_1): [1, 2, 3, 10]
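
One possible answer to the question in the cell above, as a sketch: wrap the failing call in `try`/`except`. Tuples define no `append` method, so the call raises `AttributeError`, which can be caught so that execution continues.

```python
tuple_1 = (1, 2, 3, 10)

appended = True
try:
    tuple_1.append(3)
except AttributeError as error:
    # tuples are immutable and define no append method
    appended = False
    print('append failed:', error)

# execution continues after the handled failure
print('tuple_1 is unchanged:', tuple_1)
```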


Note that tuples are printed (converted to a string) in parens, e.g. '(1, 2, 3, 10, 3)', and the string representation of a list is enclosed in square brackets '[...]'. `str` is the function that does this conversion. Any object can be passed to `str`.

In [4]:
str(list_1)

'[1, 2, 3, 10, 3]'

In [5]:
str(tuple_1)

'(1, 2, 3, 10)'

The types list and tuple, like all other types we mention in this notebook, support [many operations](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range) we do not describe. I count 11 operations shared by both, and 14 more operations that modify lists.

Mutability
---
Above we saw that lists can be modified, whereas tuples cannot. More generally, some Python objects can be modified (are *mutable*) and some cannot (are *immutable*). Examples include:

+ mutable types: list, set, dictionary
+ immutable types: integer, float, string, tuple

Later we will see that immutable types serve a special purpose for dictionaries and sets.
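
Mutability matters most when two names refer to the same object. The following sketch, with illustrative variable names, shows a change made through one name appearing through the other, and an immutable tuple rejecting modification.

```python
# two names can refer to the same mutable object
a = [1, 2, 3]
b = a            # no copy is made; b refers to the same list
b.append(4)
print('a:', a)   # the change made through b is visible through a

# immutable objects reject in-place modification
t = (1, 2, 3)
try:
    t[0] = 99
except TypeError as error:
    print('tuple assignment failed:', error)

# "changing" an immutable value really creates a new object
t2 = t + (4,)
print('t:', t, 't2:', t2)
```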

Strings
-----
A *string* is a Python built-in that handles text.

In Python 3, every string (`str`) is a sequence of Unicode code points, so strings can represent text in broad, international alphabets. The `u` prefix used in the cell below is a holdover from Python 2, where it distinguished Unicode strings from 8-bit strings; Python 3 still accepts the prefix, but it has no effect.

In [6]:
# string literal
s = 'Led Zeppelin'
print(s)
# the u prefix marks a Unicode string in Python 2; Python 3 accepts it, but it is redundant
u = u'Björk'
print(u)

Led Zeppelin
Björk


A *literal* is an explicit object written in a program. We used some integer and float literals above. String *literals* can be provided in multiple ways:

In [7]:
# in single or double quotes
literal_examples = [
    'string in single quotes',
    "in double quotes: don't say it!",
    '''triple quotes provide multi-
line strings!
e e cummings loves them''',
    """triple double quotes are also multi-line, and can embed triple single quotes, like this: ''' """,
    "also, quote characters can be escaped with backslash: \"stop\", he yelled!",
    "to use backslash (\\), escape it like that!",
    "join " 'adjacent ' '''string literals separated by whitespace '''
    '-- tada!',
    "what is whitespace?"
]
for example in literal_examples:
    print('==', example, '==')

== string in single quotes ==
== in double quotes: don't say it! ==
== triple quotes provide multi-
line strings!
e e cummings loves them ==
== triple double quotes are also multi-line, and can embed triple single quotes, like this: '''  ==
== also, quote characters can be escaped with backslash: "stop", he yelled! ==
== to use backslash (\), escape it like that! ==
== join adjacent string literals separated by whitespace -- tada! ==
== what is whitespace? ==


Because strings are immutable, they cannot be changed in place. Operations that appear to modify a string instead read the existing string and create a new string with the modified characters.

In [8]:
s = 'Camel'

# Concatenation
print('The ' + s + ' ran away!')

# Format with variables
print("'{}' contains {} chars".format(s, len(s)))

# String processed as a sequence
for ch in s: print(ch)

# Strings are objects
# Test initial characters, and convert to uppercase
if s.startswith('C'): print(s.upper())

# what will happen? 
print(3 * s)
# n * s concatenates n copies of s

The Camel ran away!
'Camel' contains 5 chars
C
a
m
e
l
CAMEL
CamelCamelCamel
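
To make the immutability of strings concrete, here is a short sketch: `replace` returns a new string and leaves the original untouched, while assigning to a character position raises `TypeError`.

```python
s = 'Camel'
t = s.replace('C', 'c')   # returns a new string
print(s, t)               # s still holds 'Camel'

try:
    s[0] = 'c'            # in-place assignment is not allowed
except TypeError as error:
    print('cannot modify a string in place:', error)
```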


I use these string methods regularly: `format` (most frequently), `endswith`, `startswith`, `index`, `join`, `split`, and `strip`.

I count 45 string methods in the [standard library](https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str) which extend the dozen methods available to all sequence types. Don't try to memorize them -- rather, learn how to search for a method that suits your needs.
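
As an illustration of a few of the methods named above, this sketch cleans and splits a line of text. The input line is hypothetical. In an interactive session, `help(str)` or `dir(str)` is one way to browse the remaining methods.

```python
line = '  chr1, 1000, 2000 \n'   # hypothetical input line

cleaned = line.strip()                             # drop surrounding whitespace
fields = [f.strip() for f in cleaned.split(',')]   # split on commas
rejoined = '\t'.join(fields)                       # join with tabs

print(fields)
print(repr(rejoined))
print(cleaned.startswith('chr'), cleaned.endswith('2000'))
print(cleaned.index(','))                          # position of the first comma
```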

Sets
---
A set contains unique elements, which may be of different types. Sets are mutable, but their elements must be immutable (more precisely, *hashable*).

In [28]:
# sets
# construct sets from a list
s0 = set([0])
s1 = set([0, 1])
print('s1:', s1)
# add the element 'x'
s1.add('x')
print("s1 after adding 'x':", s1)

# Union
# note set literal in { }
s2 = s1.union({1, "bye"})
print('union s1 and {1, "bye"}:', s2)

# delete
s2.discard(1)
print('deleted 1:', s2)

# difference
print('difference s2-s0:', s2-s0)

s1: {0, 1}
s1 after adding 'x': {0, 1, 'x'}
union s1 and {1, "bye"}: {0, 1, 'x', 'bye'}
deleted 1: {0, 'x', 'bye'}
difference s2-s0: {'x', 'bye'}


When a list is converted to a *set*, duplicate elements are discarded. Sets support the [obvious](https://docs.python.org/3/tutorial/datastructures.html#sets) operations of intersection, difference, union, add element, delete element, contains, convert to list, etc. Other [operations](https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset) include issubset, issuperset, pop, and clear.
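
A brief sketch of some of these operations, using made-up values. It also shows that an unhashable object such as a list cannot be a set element.

```python
readings = [3, 1, 3, 2, 1]
unique = set(readings)            # duplicates are discarded
print(sorted(unique))             # sorted() gives a predictable list

evens = {2, 4}
print(unique & evens)             # intersection
print({1, 2}.issubset(unique))
print(unique.issuperset(evens))

# set elements must be hashable, so a list cannot be an element
list_accepted = True
try:
    bad = {[1, 2]}
except TypeError as error:
    list_accepted = False
    print('cannot put a list in a set:', error)
```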


Dictionaries
-----------
A dictionary is a mapping from keys to corresponding objects. Dictionaries are mutable, like lists. Sets and dictionaries use *hash* tables, so keyed lookups cost O(1) on average.

All dictionary keys must be immutable (more precisely, hashable). (Why might mutable keys be problematic?) Any immutable object can be a key, but typical keys are strings, integers, and tuples of these. Since Python 3.7, iterating over a dictionary returns its items in the order they were inserted; earlier versions made no ordering guarantee.
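
The key restriction can be seen directly. In this sketch the table contents are hypothetical: a tuple works as a key, while a list is rejected with `TypeError` because it is unhashable.

```python
# hypothetical lookup table keyed by (chromosome, position) tuples
variants = {('chr1', 1000): 'A>G', ('chr2', 5000): 'C>T'}
print(variants[('chr1', 1000)])

# a list is mutable, hence unhashable, and is rejected as a key
list_key_accepted = True
try:
    variants[['chr3', 42]] = 'G>A'
except TypeError as error:
    list_key_accepted = False
    print('list key rejected:', error)
```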

Dictionary literals look like:

    d = {key_1:obj_1, key_2:obj_2, ..., key_n:obj_n}

In [29]:
capitals = {
    'Albany': 'NY',
    'Boston': 'MA',
    'Hartford': 'CT'
}
print('\nUse items() to iterate over the items in a dict')
for key, value in capitals.items():
    print(key, 'is the capital of', value)

# dictionaries can also be constructed with dict()
# it takes a sequence of key-value pairs:
western_capitals = dict(
    [('Phoenix', 'Arizona'),
    ('Sacramento', 'California'),
    ('Olympia', 'Washington')]
)
print(western_capitals)

# dict also takes keyword arguments:
western_capitals = dict(
    Phoenix='Arizona',
    Sacramento='California',
    Olympia='Washington'
)
print(western_capitals)


Use items() to iterate over the items in a dict
Albany is the capital of NY
Boston is the capital of MA
Hartford is the capital of CT
{'Phoenix': 'Arizona', 'Sacramento': 'California', 'Olympia': 'Washington'}
{'Phoenix': 'Arizona', 'Sacramento': 'California', 'Olympia': 'Washington'}


In [30]:
# more dict operations
print('\nkeys() provides a view of the keys in a dictionary:')
print(western_capitals.keys())

print('\nand values() provides a view of the values:')
print(western_capitals.values())

# entries are addressed via []s:
# they can be retrieved
print('\nOlympia is the capital of', western_capitals['Olympia'])

# and assigned
western_capitals['Juneau'] = 'Alaska'
print('\nsorted(western_capitals.values()) (with Alaska):', sorted(western_capitals.values()))

# and deleted
del western_capitals['Sacramento']
print("\n'Sacramento' in western_capitals", 'Sacramento' in western_capitals)


keys() provides a view of the keys in a dictionary:
dict_keys(['Phoenix', 'Sacramento', 'Olympia'])

and values() provides a view of the values:
dict_values(['Arizona', 'California', 'Washington'])

Olympia is the capital of Washington

sorted(western_capitals.values()) (with Alaska): ['Alaska', 'Arizona', 'California', 'Washington']

'Sacramento' in western_capitals False
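
Retrieving a key that is not in the dictionary with `[]` raises `KeyError`. The sketch below shows that failure and two common ways to avoid it: `get`, which returns a default instead, and a membership test.

```python
western_capitals = {'Phoenix': 'Arizona', 'Olympia': 'Washington'}

# [] raises KeyError for a missing key
try:
    western_capitals['Sacramento']
except KeyError as error:
    print('missing key:', error)

# get() returns None, or a default you supply, instead of raising
print(western_capitals.get('Sacramento'))
print(western_capitals.get('Sacramento', 'unknown'))

# testing membership first also works
if 'Phoenix' in western_capitals:
    print(western_capitals['Phoenix'])
```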


Booleans
------------------------

### Boolean constants: True and False
In Python, the boolean type (*bool*) takes the values `True` and `False`.

In [11]:
print('Boolean type:', type(True))

Boolean type: <class 'bool'>


### Boolean values for data structures

In [12]:
# Empty data structures, and the integer 0, are False in a Boolean context
# Examples of empty values
for example_type, empty_literal in [
    ('int', 0),
    ('str', ""),
    ('list', []),
    ('tuple', ()),
    ('dict', {}),
    ('set', set()),
    ('range', range(0))]:
    empty_literal_boolean = True if empty_literal else False
    print(example_type, empty_literal_boolean)

int False
str False
list False
tuple False
dict False
set False
range False


In [13]:
# Non-empty data structures are True
for example_type, non_empty_literal in [
    ('int', 1),
    ('str', "x"),
    ('list', [1]),
    ('tuple', (0,)),  # the trailing comma makes a 1-element tuple; (0) is just the int 0
    ('dict', {1:2}),
    ('set', {'hi'}),
    ('range', range(3))]:
    non_empty_literal_boolean = True if non_empty_literal else False
    print(example_type, non_empty_literal_boolean)

int True
str True
list True
tuple True
dict True
set True
range True
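
A note on the tuple case: parentheses alone only group an expression, so a one-element tuple needs a trailing comma. A small sketch of the difference:

```python
not_a_tuple = (0)     # parentheses only group; this is the int 0
one_tuple = (0,)      # the trailing comma makes a tuple

print(type(not_a_tuple).__name__, bool(not_a_tuple))
print(type(one_tuple).__name__, bool(one_tuple), len(one_tuple))
```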


### Boolean Operations

Use Boolean operators to construct logical expressions that direct control flow. Because all types have truth values, any object can be incorporated in a boolean expression.

The [Boolean operators in Python](https://docs.python.org/3/library/stdtypes.html#boolean-operations-and-or-not) are: `and`, `or`, and `not`.

These are *short-circuit operators*, which skip evaluations that cannot affect the result:
+ `and`: if the first argument is false, do not evaluate the second argument
+ `or` : if the first argument is true, do not evaluate the second argument

Boolean operators do not convert their arguments to Boolean values:
+ `and`: if the first argument is false, return it, otherwise return the second argument
+ `or` : if the first argument is true, return it, otherwise return the second argument

Examples:

In [14]:
print('0 and 3:', 0 and 3) # computes 0
print('2 and [3]:', 2 and [3]) # computes [3]

print('0 or [3]:', 0 or [3]) # computes [3]
print("'first' or False:", 'first' or False) # computes 'first'

0 and 3: 0
2 and [3]: [3]
0 or [3]: [3]
'first' or False: first
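
Short-circuiting can be observed by recording which operands are actually evaluated. In this sketch, `noisy` is a hypothetical helper written only for the demonstration.

```python
calls = []

def noisy(value):
    # hypothetical helper that records each evaluation
    calls.append(value)
    return value

result_and = noisy(0) and noisy(3)              # noisy(3) is never called
result_or = noisy('first') or noisy('second')   # noisy('second') is never called
print(calls)
print(result_and, result_or)
```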


### Practical uses
This behavior makes Boolean operators useful for selecting among objects. E.g.,

    error_messages or data

will evaluate to `error_messages` if it is true (e.g., a non-empty list); otherwise it evaluates to `data`.
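
A minimal sketch of this selection idiom follows; `first_available` is a hypothetical helper. It also shows a pitfall: because `or` tests truthiness, a legitimate but false value such as 0 is skipped.

```python
def first_available(error_messages, data):
    # hypothetical helper: prefer the errors when there are any
    return error_messages or data

print(first_available([], [1, 2, 3]))
print(first_available(['bad input'], [1, 2, 3]))

# caution: `or` tests truthiness, so a legitimate falsy value is skipped
count = 0
chosen = count or 10
print(chosen)   # 10, even though count was set to 0
```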

On the other hand, `not` does return Boolean values:

In [15]:
print('not 0:', not 0) # Shows True
print('not [5]:', not [5]) # Shows False


not 0: True
not [5]: False


Comparisons
-----------
The Python [comparison operations](https://docs.python.org/3/library/stdtypes.html#comparisons) return Boolean values. They are:

| Operation | Meaning |
| :---: | :---: |
| `<` | strictly less than |
| `<=` | less than or equal |
| `>` | strictly greater than |
| `>=` | greater than or equal |
| `==` | equal |
| `!=` | not equal |
| `is` | object identity |
| `is not` | negated object identity |



In [17]:
print("'2 < 3':", 2 < 3)
print('\nPython nicely chains comparisons:')
print("'2 < 3 <= 2':", 2 < 3 <= 2)

print('\n== evaluates whether 2 objects have the same value:')
print("'2 == 2':", 2 == 2)
x = [3]
y = [3]
print("'x == y':", x == y)

print("\n'is' evaluates whether 2 objects are the same object:")
print("'x is y':", x is y)
print("'x is x':", x is x)
print("'x is not y':", x is not y)

'2 < 3': True

Python nicely chains comparisons:
'2 < 3 <= 2': False

== evaluates whether 2 objects have the same value:
'2 == 2': True
'x == y': True

'is' evaluates whether 2 objects are the same object:
'x is y': False
'x is x': True
'x is not y': True
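
The difference between `==` and `is` is clearest when a second name refers to the same object. This sketch also uses the built-in `id` function, and checks that a chained comparison agrees with its parts joined by `and`.

```python
x = [3]
y = [3]
z = x                  # z is another name for the same list

print(x == y, x is y)  # equal values, distinct objects
print(x == z, x is z)  # same object

# id() returns an integer identity; it matches exactly when `is` is true
print(id(x) == id(z), id(x) == id(y))

# a chained comparison behaves like its parts joined with `and`
a, b, c = 2, 3, 2
chained_matches = (a < b <= c) == (a < b and b <= c)
print(chained_matches)
```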


### None
`None` is a built-in value that represents no information.

In [18]:
print(None)
print('None has a special type: type(None):', type(None))

print('\nNone only equals None:')
print('Test equality: 3 == None:', 3==None)
print('None == None:', None == None)
print('None is None:', None is None)

None
None has a special type: type(None): <class 'NoneType'>

None only equals None:
Test equality: 3 == None: False
None == None: True
None is None: True
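
A common use of `None` is to mark "no value supplied". The sketch below uses a hypothetical function, `describe`, and tests with `is None` so that other false values such as 0 are not mistaken for a missing value.

```python
def describe(value=None):
    # hypothetical function: None marks "no argument supplied"
    if value is None:
        return 'no value provided'
    return 'value: {}'.format(value)

print(describe())
print(describe(0))    # 0 is falsy but is not None

# a function without a return statement returns None
def do_nothing():
    pass

print(do_nothing() is None)
```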
