# BMI565: Bioinformatics Programming & Scripting

#### (C) Michael Mooney (mooneymi@ohsu.edu)

## Week 1: Documentation and Data Types

1. Introduction and Why Python?
2. Help, Documentation, Code Organization
    - Help
    - Comments and Docstrings
3. None and Logical Values
4. Numeric Data Types
    - Numeric Operators
5. Sequence Data Types
    - Lists
    - Tuples
    - Sets
    - Strings
    - String Formatting
6. Mapping Data Types
    - Dictionaries
7. A Quick Note on Python 2.7 vs. Python 3

#### Requirements
- Python 2.7
- Miscellaneous Files
    - `sequencing_vs_compute.jpg`

## Introduction and Why Python?

- <b>Bioinformatics:</b> The application of computer science to the fields of biology and medicine
- <b>Programming:</b> The process of designing, writing, testing, debugging, and maintaining a list of computer instructions
- <b>Scripting:</b> Creating a list of computer instructions that call one or more applications

### Computational demands in molecular biology are increasing
<img src="./images/sequencing_vs_compute.jpg" width="500" height="500" align="left" />

### Compiled vs. Interpreted Languages

<table align="left">
<tr><td style="text-align:center"><b>Compiled Languages (e.g. C++)</b></td><td style="text-align:center"><b>Interpreted Languages (e.g. Python)</b></td></tr>
<tr><td style="text-align:center">Code is converted from a high-level language<br /> and runs as machine code</td><td style="text-align:center">Code runs on an interpreter and is converted <br />to machine code one instruction at a time</td></tr>
<tr><td style="text-align:center">Code is machine specific</td><td style="text-align:center">Code is portable; can run on multiple platforms</td></tr>
<tr><td style="text-align:center">Tends to run faster</td><td style="text-align:center">Smaller executable (script) size</td></tr>
<tr><td style="text-align:center">You can distribute standalone executables</td><td style="text-align:center">An interpreter must be installed</td></tr>
<tr><td style="text-align:center">You need to recompile code after making changes</td><td style="text-align:center">No need to recompile</td></tr>
</table>

### Python for Scientific Computing

#### Some Python History
- Created by Guido van Rossum (1989)
- Originally created as a means to automate system admin tasks
    - C programs took too long to write
- van Rossum was a Monty Python fan!
- Python became open source in 1991 (version 0.9)

#### Scientific Computing

There is a large community of Python developers creating tools for scientific applications. Here are a few:

- Numpy and Scipy
    * Tools for working with large data arrays, Optimization, Linear Algebra, etc.
    * [http://www.scipy.org](http://www.scipy.org)
- BioPython
    * Bioinformatics and Computational Biology
    * [http://biopython.org/wiki/Biopython](http://biopython.org/wiki/Biopython)
- Matplotlib
    * Data Presentation, Plotting
    * [http://matplotlib.sourceforge.net/](http://matplotlib.sourceforge.net/)
- Scikit-learn
    * Statistics and Machine Learning
    * [http://scikit-learn.org/stable/](http://scikit-learn.org/stable/)

### Writing and Running a Python Program

A Python program is simply a text file containing commands that can be interpreted by the Python interpreter. It is convention to use the `.py` file extension. A Python program can be created with any text editor, but integrated development environments (IDEs) can make coding easier. IDEs have features such as syntax coloring, automatic indentation, code completion, etc. that can save you time and reduce errors. IDLE is the "official" Python IDE that is packaged with Python installations. Eclipse, Spyder, and Xcode are other examples of IDEs.

Below is an example of a very simple Python program, which simply prints the message "Hello, world!" to the screen. The first line is called a "shebang" or "hashbang", which tells the operating system the location of the Python interpreter. 

    #!/usr/bin/python
    
    print "Hello, world!"

<u>Try it out</u>: Copy and paste the above two lines into a text file and save it as `hello_world.py`. Then run the program by opening a terminal, changing to the file's location and typing the following at the command-line:

    python hello_world.py

Alternatively, you can call the program directly, without explicitly calling the Python interpreter (this requires that the shebang be present on the first line). First you will need to make the file executable (using the `chmod` command), then you can call the program directly:

    chmod 755 hello_world.py
    ./hello_world.py

## Python Help and Code Documentation

Python's `help()` function can provide documentation about functions, modules, methods, etc. 

<a href="https://docs.python.org/2/">Python's Online Documentation</a> is a great resource. 

And, of course, Google will provide plenty of information, examples, etc.

In [None]:
# For help on any function simply call help() with the function name as the parameter
# For example, what does the len() function do?
help(len)

### Comments:  Please use them!

Any text following a `#` is ignored by the Python interpreter. Text surrounded by `"""` is a multi-line string literal; when it is not assigned to a variable it has no effect, so it is often used to create multi-line comments.

Comments can be purely descriptive (e.g. used to explain how code works), but can also serve a purpose when testing or debugging code. For instance, commenting out sections of your code can help you narrow down where an error is occurring.

It is a good idea to write descriptive comments as you code, rather than going back after code is already written. Please use comments thoroughly so it is easy for others to interpret your code. 

All code submitted for homework assignments should include a block of comments at the top of the file that includes the following information:

    # Your Name
    # Assignment Number
    # Date Submitted/Written
    # A short description of what the code does
    # An example of how the script should be run
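
As a rough sketch, a homework script with such a header might start like the one below. The name, assignment number, date, and file name are made-up placeholders. `print()` is called with a single string so the line behaves the same under Python 2.7 and Python 3.

```python
#!/usr/bin/python

# Jane Doe (placeholder name)
# Assignment 1 (placeholder number)
# Written 01-Jan-2016 (placeholder date)
# Prints a greeting to the screen
# Example usage: python hello_world.py

greeting = "Hello, world!"
print(greeting)
```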

In [2]:
print "Hello"
#print "Good-bye"

Hello


### Docstrings

Docstrings are a good way to document your code. A docstring is a string literal that occurs as the first statement in a module, function, class or method definition. The info in a docstring will be shown when `help()` is called on the function, module, etc.

In [3]:
def hello_world(n=1):
    """
    This function prints the message "Hello, world!" a specified number of times.
    """
    for i in range(n):
        print "Hello, world!"


In [4]:
hello_world(2)

Hello, world!
Hello, world!


In [5]:
help(hello_world)

Help on function hello_world in module __main__:

hello_world(n=1)
    This function prints the message "Hello, world!" a specified number of times.



There are a number of very useful style guides for creating docstrings. Automated documentation programs, such as [Sphinx](http://sphinx-doc.org/), can parse docstrings that follow these guidelines, and allow you to quickly create web documentation of your programs. I highly recommend you get used to writing thorough docstrings in a standard format, so that your code is well documented.

[Google Style Guide](http://sphinxcontrib-napoleon.readthedocs.org/en/latest/example_google.html#example-google)

[Numpy Style Guide](http://sphinxcontrib-napoleon.readthedocs.org/en/latest/example_numpy.html#example-numpy)

A quick example of a Google-style docstring for a function:

    def hello_world(n=1):
        """
        This function prints the message "Hello, world!" a specified number of times.
        
        Args:
            n (int): The number of times to print the message.
            
        Returns:
            None
        
        """
        for i in range(n):
            print "Hello, world!"

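For comparison, the same function with a NumPy-style docstring might look like the sketch below. The section layout follows the NumPy guide linked above. The `doc_text` variable is only there to show that the docstring is stored on the function itself. `print()` with a single argument is used so the code runs under both Python 2.7 and Python 3.

```python
def hello_world(n=1):
    """
    Print the message "Hello, world!" a specified number of times.

    Parameters
    ----------
    n : int
        The number of times to print the message.

    Returns
    -------
    None
    """
    for i in range(n):
        print("Hello, world!")


# The docstring is stored on the function; this is the text help() displays
doc_text = hello_world.__doc__
```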

## Python Data Types

## None and Logical Values

`None` is the Python NULL value. NONE or none will not be interpreted as the `None` value.

`None` is the default return value for Python functions (e.g. when no return value is specified).

In [6]:
x = hello_world()

Hello, world!


In [8]:
if x == None:
    # This usually works
    print "x == None"
if x is None:
    # But this is the preferred way to compare to None
    print "x is None"


x == None
x is None
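
One reason `is` is preferred: `==` asks an object whether it considers itself equal to `None`, and a class is free to answer however it likes, while `is` checks whether two names refer to the very same object. The class below is a made-up illustration (not part of any library) of a case where the two comparisons disagree.

```python
class AlwaysEqual(object):
    """Hypothetical class that claims to be equal to everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

equals_none = (obj == None)   # True, because __eq__ always says so
is_none = (obj is None)       # False, obj is not the None object
```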


Logical values are specified with `True` or `False`. Again, these values must be specified exactly as `True` or `False` (e.g. not TRUE or true).

In [9]:
x = True
if x:
    print("True")

if not x:
    print("False")

True


### Logical Operators

<table align="left">
<tr><td style="text-align:center"><b>Operation</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`x and y`</td><td>`True` if both `x` and `y` are `True`</td></tr>
<tr><td style="text-align:center">`x or y`</td><td>`True` if `x` or `y` is `True`</td></tr>
<tr><td style="text-align:center">`not x`</td><td>`True` if `x` is `False`</td></tr>
<tr><td style="text-align:center">`all(s)`</td><td>`True` only if all elements of `s` are `True`</td></tr>
<tr><td style="text-align:center">`any(s)`</td><td>`True` if any elements of `s` are `True`</td></tr>
</table>

In [10]:
True and False or (True and not False)

True

In [11]:
## The all function returns True if all elements are True
all([True, True, False])

False

In [12]:
## The any() function returns True if any elements are True
any([False, False, True])

True

## Numeric Data Types

- `int` - Integer, at least 32-bit precision (a `long` in C) <br />
- `float` - Floating point, 64-bit precision (a `double` in C) <br />
- `long` - Long integer: unlimited precision<br />
- `complex` - Complex numbers: these have `.real` and `.imag` components

### Numeric Operators

<table align="left">
<tr><td style="text-align:center"><b>Operator</b></td><td style="text-align:center"><b>Example</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`*`</td><td style="text-align:center">`x * y`</td><td>Multiplication of `x` and `y`</td></tr>
<tr><td style="text-align:center">`/`</td><td style="text-align:center">`x / y`</td><td>Quotient of `x` and `y`</td></tr>
<tr><td style="text-align:center">`//`</td><td style="text-align:center">`x // y`</td><td>Floored quotient of `x` and `y`</td></tr>
<tr><td style="text-align:center">`%`</td><td style="text-align:center">`x % y`</td><td>Remainder of `x / y`</td></tr>
<tr><td style="text-align:center">`**`</td><td style="text-align:center">`x ** y`</td><td>`x` to the power `y`</td></tr>
<tr><td style="text-align:center">`+`</td><td style="text-align:center">`x + y`</td><td>Addition of `x` and `y`</td></tr>
<tr><td style="text-align:center">`-`</td><td style="text-align:center">`x - y`</td><td>Difference of `x` and `y`</td></tr>
<tr><td style="text-align:center">`-`</td><td style="text-align:center">`-x`</td><td>Negation of `x`</td></tr>
</table>

### Other Numeric Operations
<table align="left">
<tr><td style="text-align:center"><b>Operation</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`abs(x)`</td><td>Absolute value of `x`</td></tr>
<tr><td style="text-align:center">`pow(x, y)`</td><td>`x` to the power `y`</td></tr>
<tr><td style="text-align:center">`divmod(x, y)`</td><td>The pair `(x // y, x % y)`</td></tr>
<tr><td style="text-align:center">`int(x)`</td><td>`x` converted to integer</td></tr>
<tr><td style="text-align:center">`long(x)`</td><td>`x` converted to long</td></tr>
<tr><td style="text-align:center">`float(x)`</td><td>`x` converted to float</td></tr>
<tr><td style="text-align:center">`complex(re,im)`</td><td>A complex number with real part `re` and imaginary part `im` (defaults to 0)</td></tr>
<tr><td style="text-align:center">`c.conjugate()`</td><td>The conjugate of the complex number `c`</td></tr>
</table>
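
A few of the functions in the table above, sketched with arbitrary example values. Everything here behaves the same under Python 2.7 and Python 3.

```python
quotient, remainder = divmod(17, 5)   # 3 and 2
magnitude = abs(-7.5)                 # 7.5
power = pow(2, 10)                    # 1024, same as 2 ** 10
as_int = int(3.99)                    # 3 (the fractional part is dropped)
as_float = float(5)                   # 5.0

c = complex(3, 4)                     # (3+4j)
real_part = c.real                    # 3.0
imag_part = c.imag                    # 4.0
conj = c.conjugate()                  # (3-4j)
```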

In [13]:
5/6

0

In [15]:
float(5)/6

0.8333333333333334

### Operator Precedence

1. `()`
2. `**`
3. `+/-` (negation)
4. `*,/,//,%`
5. `+,- ` (addition and subtraction)
6. `not`
7. `and`
8. `or`
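
The ordering above can be checked with a few small expressions. The values are arbitrary; parentheses are added in the second version of each pair to override the default order.

```python
a = 2 + 3 * 4                   # 14: multiplication before addition
b = (2 + 3) * 4                 # 20: parentheses first

c = -2 ** 2                     # -4: ** is applied before negation
d = (-2) ** 2                   # 4

e = 2 * 3 ** 2                  # 18: ** before multiplication

f = not True or True            # True: evaluated as (not True) or True
g = True or False and False     # True: 'and' is evaluated before 'or'
h = (True or False) and False   # False
```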

### Comparison Operators
<table align="left">
<tr><td style="text-align:center"><b>Operator</b></td><td style="text-align:center"><b>Example</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`<`</td><td style="text-align:center">`x < y`</td><td>`x` less than `y`</td></tr>
<tr><td style="text-align:center">`>`</td><td style="text-align:center">`x > y`</td><td>`x` greater than `y`</td></tr>
<tr><td style="text-align:center">`==`</td><td style="text-align:center">`x == y`</td><td>`x` equals `y`</td></tr>
<tr><td style="text-align:center">`<=`</td><td style="text-align:center">`x <= y`</td><td>`x` less than or equal to `y`</td></tr>
<tr><td style="text-align:center">`>=`</td><td style="text-align:center">`x >= y`</td><td>`x` greater than or equal to `y`</td></tr>
<tr><td style="text-align:center">`!=`</td><td style="text-align:center">`x != y`</td><td>`x` not equal to `y`</td></tr>
</table>

## Sequence Data Types

### Lists

Lists are ordered collections of data. Python lists are mutable data types, which means the individual elements of a list can be modified.

#### List methods
<table align="left">
<tr><td style="text-align:center"><b>Method</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`l.append(x)`</td><td>Appends `x` to the end of the list `l`</td></tr>
<tr><td style="text-align:center">`l.extend(y)`</td><td>Appends each element of the list `y` to the end of `l`.</td></tr>
<tr><td style="text-align:center">`l.insert(i, x)`</td><td>Inserts `x` at index `i`</td></tr>
<tr><td style="text-align:center">`l.sort()`</td><td>Sorts the list `l`</td></tr>
<tr><td style="text-align:center">`l.pop([i])`</td><td>Removes the element at position `i` and returns it. If `i` is not given, the last element will be removed.</td></tr>
<tr><td style="text-align:center">`l.reverse()`</td><td>Reverses the list `l`</td></tr>
<tr><td style="text-align:center">`l.index(x)`</td><td>Returns the index of the first occurrence of `x`</td></tr>
<tr><td style="text-align:center">`l.remove(x)`</td><td>Removes the first instance of `x` in the list</td></tr>
</table>

\*\* Note: Methods such as `list.append()`, `list.extend()`, etc. do not return the updated list. They return `None`. So be careful about assigning the results of these methods to a variable and expecting a list.

See [https://docs.python.org/2/tutorial/datastructures.html](https://docs.python.org/2/tutorial/datastructures.html) for more details
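
The methods in the table can be walked through in sequence, as in the sketch below. The gene names are arbitrary example values. Each comment shows the state of the list after the call; note that all of these methods modify the list in place.

```python
genes = ['TP53', 'BRCA1']

genes.append('EGFR')            # ['TP53', 'BRCA1', 'EGFR']
genes.extend(['MYC', 'KRAS'])   # ['TP53', 'BRCA1', 'EGFR', 'MYC', 'KRAS']
genes.insert(0, 'APC')          # ['APC', 'TP53', 'BRCA1', 'EGFR', 'MYC', 'KRAS']
genes.sort()                    # ['APC', 'BRCA1', 'EGFR', 'KRAS', 'MYC', 'TP53']

last = genes.pop()              # returns 'TP53' and removes it from the list
genes.reverse()                 # ['MYC', 'KRAS', 'EGFR', 'BRCA1', 'APC']

egfr_index = genes.index('EGFR')   # 2
genes.remove('KRAS')               # ['MYC', 'EGFR', 'BRCA1', 'APC']
```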

In [16]:
help(list.sort)

Help on method_descriptor:

sort(...)
    L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
    cmp(x, y) -> -1, 0, 1



In [17]:
# Lists can contain multiple different data types
list1 = [1,2,3.5,'A','B','C']
list1

[1, 2, 3.5, 'A', 'B', 'C']

In [18]:
x = list1.append('D')
x is None

True

In [19]:
list1

[1, 2, 3.5, 'A', 'B', 'C', 'D']

In [20]:
## Use the len() function to get the length of any 
## sequence variable (list, string, tuple, etc.)
len(list1)

7

In [21]:
# You can access list elements with an index (starting with 0)
list1[0]

1

In [22]:
# You can access a subset of a list using slice notation
# The first index is inclusive, and the second is exclusive, 
# so [0:3] will give you the first, second and third elements
list1[0:3]

[1, 2, 3.5]

In [29]:
# You can also specify a 'step' to get, for instance,
# every second element in the list. [0:-1:2] will give you
# the first, third, fifth, etc. element excluding the last 
# element of the list [-1].
list1[0:-1:2]

[1, 3.5, 'B']

In [26]:
list1[-2]

'C'

In [30]:
# To specify the entire list don't enter start or end indices
list1[:]

[1, 2, 3.5, 'A', 'B', 'C', 'D']

### Copying Lists

When you assign a list variable to another variable, you are simply creating a second reference to the same list. This means that if the first list is modified, the change is visible through the second variable as well. To create a new list that is a copy of another list, you should use the `list()` function or slice notation (see below). These create shallow copies: the new list still refers to the same elements. The `copy.deepcopy()` function can be used to copy lists that contain other objects, such as nested lists.

In [31]:
# Create a new list
A = [1,2,3,4]
# Assign list A to a new variable
B = A
B

[1, 2, 3, 4]

In [32]:
# Create a copy of list A
C = list(A)
C

[1, 2, 3, 4]

In [33]:
# Another way to create a copy of list A
D = A[:]
D

[1, 2, 3, 4]

In [34]:
# Change the value of the third element in list A
A[2] = 16
A

[1, 2, 16, 4]

In [35]:
# The value of list B changes also! List B is simply a reference to list A.
B

[1, 2, 16, 4]

In [36]:
# List C is a new list (a copy of A) and its values did not change
C

[1, 2, 3, 4]

In [37]:
# Same for list D
D

[1, 2, 3, 4]

In [38]:
E = [A, C]
E

[[1, 2, 16, 4], [1, 2, 3, 4]]

In [43]:
E[0][0]

1

In [39]:
# When using list() to create a copy of a list that contains objects 
# a new list will be created, but the elements of the list will
# still be references to the objects
F = list(E)

In [40]:
# Using the deepcopy() method will create a copy of a list and any 
# objects (e.g. other lists) that are elements of that list
import copy
G = copy.deepcopy(E)
G

[[1, 2, 16, 4], [1, 2, 3, 4]]

In [41]:
# Change the last element of list C
C[-1] = 20
C

[1, 2, 3, 20]

In [44]:
F

[[1, 2, 16, 4], [1, 2, 3, 20]]

In [45]:
G

[[1, 2, 16, 4], [1, 2, 3, 4]]
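
One way to confirm which of these variables share an object is the `is` operator, which is true only when both names refer to the same object. The sketch below rebuilds the lists from scratch so that it runs on its own.

```python
import copy

A = [1, 2, 3, 4]
B = A              # a second name for the same list
C = list(A)        # a new list with the same elements

same_object = B is A       # True
copy_is_same = C is A      # False
copy_is_equal = C == A     # True: equal contents, different objects

E = [A, C]
F = list(E)            # new outer list, but the inner lists are shared
G = copy.deepcopy(E)   # new outer list and new inner lists

shallow_shares_inner = F[0] is A   # True
deep_shares_inner = G[0] is A      # False
```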

### Sets

Sets are unordered collections of unique (non-duplicate) items. Sets are mutable, and are very useful for comparing collections of items.

#### Set Methods

<table align="left">
<tr><td style="text-align:center"><b>Method</b></td><td style="text-align:center"><b>Equivalent To</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`s.union(x)`</td><td style="text-align:center">`s | x`</td><td>Returns a new set containing the elements of `s` and the elements of `x`</td></tr>
<tr><td style="text-align:center">`s.intersection(x)`</td><td style="text-align:center">`s & x`</td><td>Returns a new set containing only the elements in both `s` and `x`</td></tr>
<tr><td style="text-align:center">`s.difference(x)`</td><td style="text-align:center">`s - x`</td><td>Returns a new set containing elements in `s` but not in `x`</td></tr>
<tr><td style="text-align:center">`s.symmetric_difference(x)`</td><td style="text-align:center">`s ^ x`</td><td>Returns a set containing elements in either `s` or `x`, but not both</td></tr>
<tr><td style="text-align:center">`s.issubset(x)`</td><td style="text-align:center">`s <= x`</td><td>Returns `True` if `x` contains all elements of `s`</td></tr>
<tr><td style="text-align:center">`s.issuperset(x)`</td><td style="text-align:center">`s >= x`</td><td>Returns `True` if `s` contains all elements of `x`</td></tr>
</table>

#### Other Set Operations
<table align="left">
<tr><td style="text-align:center"><b>Operation</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`s.add(y)`</td><td>Adds the value `y` to the set `s`</td></tr>
<tr><td style="text-align:center">`s.remove(y)`</td><td>Removes the value `y` from `s`. Raises a `KeyError` if `y` is not in `s`.</td></tr>
<tr><td style="text-align:center">`s.discard(y)`</td><td>Removes the value `y` from `s` if present</td></tr>
<tr><td style="text-align:center">`s.update(z)`</td><td>Updates the set `s` by adding all elements from set `z`</td></tr>
<tr><td style="text-align:center">`s.clear()`</td><td>Removes all elements from `s`</td></tr>
</table>

See <a href="https://docs.python.org/2/library/stdtypes.html#set">https://docs.python.org/2/library/stdtypes.html#set</a> for more details

In [46]:
## Create a set using the set() function
s1 = set(['blue', 'yellow', 'red'])
s2 = set(['orange', 'green', 'blue'])

## Curly braces can also be used
s3 = {'pink', 'purple'}

In [47]:
## Get the intersection of two sets
s1 & s2

{'blue'}
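
The other operations from the tables above work the same way. The sketch below redefines the sets so it can be run on its own; because sets are unordered, `sorted()` is used wherever a predictable order is wanted.

```python
s1 = set(['blue', 'yellow', 'red'])
s2 = set(['orange', 'green', 'blue'])

all_colors = sorted(s1 | s2)      # ['blue', 'green', 'orange', 'red', 'yellow']
only_in_s1 = sorted(s1 - s2)      # ['red', 'yellow']
in_one_only = sorted(s1 ^ s2)     # ['green', 'orange', 'red', 'yellow']
is_subset = set(['blue']) <= s1   # True

s3 = set(['pink', 'purple'])
s3.add('pink')                    # already present, so nothing changes
s3.discard('teal')                # not present, and no error is raised
s3.update(set(['white']))         # s3 now also contains 'white'
s3_sorted = sorted(s3)            # ['pink', 'purple', 'white']
```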

### Tuples
Tuples are immutable sequences. Tuples are similar to lists, but you can't change individual elements.

In [48]:
## Tuples are created by placing parentheses around 
## a sequence of values separated by commas
t1 = (1,2,3,4)
t1

(1, 2, 3, 4)

In [49]:
## To create a tuple with a single value, place a comma at the end
t2 = ('A',)
t2

('A',)

In [50]:
## You can't change the elements of a tuple
t1[0] = 3

TypeError: 'tuple' object does not support item assignment

### Strings

Strings are immutable sequences of characters, and can be accessed much like lists. There are, of course, numerous methods that are specific to strings. Below are a few useful string methods.

#### String Methods
<table align="left">
<tr><td style="text-align:center"><b>Operation</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`str.find(sub)`</td><td>Returns the lowest index where substring `sub` is found in `str`, or `-1` if it is not found</td></tr>
<tr><td style="text-align:center">`str.replace(old, new)`</td><td>Returns a copy of `str` with all occurrences of substring `old` replaced by `new`.</td></tr>
<tr><td style="text-align:center">`str.split([sep])`</td><td>Returns a list containing substrings of `str` using `sep` as the delimiter.</td></tr>
<tr><td style="text-align:center">`str.join(iterable)`</td><td>Returns a string that is a concatenation of the elements in<br /> `iterable`. The separator between elements is `str`.</td></tr>
<tr><td style="text-align:center">`str.strip([chars])`</td><td>Returns a copy of `str` with leading and trailing `chars` removed. <br />If `chars` is not specified, whitespace will be removed.</td></tr>
</table>

See [https://docs.python.org/2/library/stdtypes.html#string-methods](https://docs.python.org/2/library/stdtypes.html#string-methods) for more details

In [51]:
## Define a string by placing quotes around a sequence of characters
s1 = "hello"
s1

'hello'

In [52]:
## Use the str() function to convert values to string
str(5)

'5'

In [53]:
## Create a new string with a substring replaced (s1 itself is unchanged)
s2 = s1.replace('h', 'H')

In [54]:
s2

'Hello'

In [55]:
## Create a list containing a string's characters
list(s1)

['h', 'e', 'l', 'l', 'o']
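
The remaining methods from the table (`find`, `split`, `join` and `strip`) are often used together when parsing text. The tab-delimited record below is an invented example, not a real file format.

```python
record = "  chr1\t12345\tA\tG \n"

cleaned = record.strip()            # leading/trailing whitespace removed
fields = cleaned.split('\t')        # ['chr1', '12345', 'A', 'G']
csv_line = ','.join(fields)         # 'chr1,12345,A,G'

position = 'GATTACA'.find('TTA')    # 2
not_found = 'GATTACA'.find('CCC')   # -1 when the substring is absent
trimmed = 'xxhixx'.strip('x')       # 'hi'
```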

### Operations for Sequence Data Types

<table align="left">
<tr><td style="text-align:center"><b>Operation</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`s + y`</td><td>Concatenates `s` and `y`</td></tr>
<tr><td style="text-align:center">`s * n`</td><td>Returns `n` copies of `s`</td></tr>
<tr><td style="text-align:center">`v1, v2, v3 = s`</td><td>Variable unpacking</td></tr>
<tr><td style="text-align:center">`x in s` <br /> `x not in s`</td><td>Determine membership. Returns `True` or `False`.</td></tr>
<tr><td style="text-align:center">`len(s)`</td><td>Returns the length of `s`.</td></tr>
<tr><td style="text-align:center">`min(s)`</td><td>Returns the minimum value in `s`.</td></tr>
<tr><td style="text-align:center">`max(s)`</td><td>Returns the maximum value in `s`.</td></tr>
<tr><td style="text-align:center">`sum(s)`</td><td>Returns the sum of `s` (elements of `s` must be numeric).</td></tr>
</table>

In [56]:
s1 + ' ' + s2

'hello Hello'

In [57]:
'A' in s1

False
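
The rest of the operations in the table apply to lists, tuples and strings alike. A short sketch with arbitrary example values:

```python
codons = ['ATG', 'GCC', 'TAA']

repeated = 'AT' * 3              # 'ATATAT'
combined = codons + ['TGA']      # a new list; codons itself is unchanged
first, second, third = codons    # variable unpacking
has_stop = 'TAA' in codons       # True

depths = (12, 7, 30)
n_depths = len(depths)           # 3
lowest = min(depths)             # 7
highest = max(depths)            # 30
total = sum(depths)              # 49
```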

### String Formatting

The `%` operator can be used to control the formatting of strings. This is useful for creating nice-looking output. The format is as follows:

    "<some-string> %<modifier><conversion-specifier>" % (<a tuple of values>)

#### Conversion Specifiers
<table align="left">
<tr><td style="text-align:center"><b>Character</b></td><td><b>Output Format</b></td></tr>
<tr><td style="text-align:center">`d` or `i`</td><td>Integer decimal</td></tr>
<tr><td style="text-align:center">`o`</td><td>Octal</td></tr>
<tr><td style="text-align:center">`x`</td><td>Hexadecimal</td></tr>
<tr><td style="text-align:center">`f`</td><td>Floating point decimal</td></tr>
<tr><td style="text-align:center">`e`</td><td>Floating point exponential format</td></tr>
<tr><td style="text-align:center">`s`</td><td>String</td></tr>
</table>

#### Formatting Modifiers
1. A number specifying minimum field width
2. A `.` separating the field width from the precision number
3. A precision number specifying the number of characters to be printed from a string (or the number of digits after the decimal point for a floating point number)
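
The specifiers and modifiers above can be combined as in the following sketch. The values are arbitrary; each comment shows the resulting string.

```python
padded_int = "%5d" % 42                   # '   42' (minimum width of 5)
truncated = "%.3s" % "python"             # 'pyt' (first 3 characters)
hex_value = "%x" % 255                    # 'ff'
octal_value = "%o" % 8                    # '10'
scientific = "%.2e" % 12345.678           # '1.23e+04'
row = "%8s%8.2f" % ("geneA", 3.14159)     # '   geneA    3.14'
```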

In [58]:
## Create string using the % operator
s2 = "My favorite number is %d" % (5,)
s2

'My favorite number is 5'

In [59]:
## More complicated formatting
s3 = "This is %s displayed to 5 decimal points: '%10.5f'" % ("pi", 3.14159265359)
s3

"This is pi displayed to 5 decimal points: '   3.14159'"

#### Another (Newer) Way to Format Strings

The `str.format()` method can also be used to format strings, using similar specifiers and modifiers as described above. In this case, however, placeholders in your string are denoted by curly braces (`{}`), and `'%'` is replaced with `':'`. The general format for the placeholder (or replacement field) is as follows:

    { [field_name] [! conversion] [: format_spec] }

[https://docs.python.org/2/library/string.html#formatstrings](https://docs.python.org/2/library/string.html#formatstrings)
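
A few variations on the replacement field, sketched with arbitrary values: fields can be referenced by position or by name, `!r` applies `repr()` to the value, and the part after `:` takes the same width and precision modifiers described earlier.

```python
by_position = "{0}-{1}-{0}".format("a", "b")                       # 'a-b-a'
by_name = "{gene}: {count:d} reads".format(gene="TP53", count=3)   # 'TP53: 3 reads'
as_repr = "{!r}".format("x")                                       # "'x'"
fixed_width = "{:8.3f}".format(2.0 / 3)                            # '   0.667'
```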

In [60]:
## Format a string using the format() method
s4 = "This is {} displayed to 5 decimal points: '{:10.5f}'"
s4.format("pi", 3.14159265359)

"This is pi displayed to 5 decimal points: '   3.14159'"

#### Escape Characters
<table align="left">
<tr><td style="text-align:center"><b>Character</b></td><td><b>Output</b></td></tr>
<tr><td style="text-align:center">`\\`</td><td>Backslash</td></tr>
<tr><td style="text-align:center">`\'`</td><td>Single quote</td></tr>
<tr><td style="text-align:center">`\"`</td><td>Double quote</td></tr>
<tr><td style="text-align:center">`\n`</td><td>Newline</td></tr>
<tr><td style="text-align:center">`\t`</td><td>Tab</td></tr>
</table>

In [61]:
print "Line1\nLine2"

Line1
Line2


## Mapping Data Types

Dictionaries are unordered collections of `key:value` pairs. Keys must be hashable values (immutable data types). 

#### Dictionary Methods
<table align="left">
<tr><td style="text-align:center"><b>Method</b></td><td><b>Description</b></td></tr>
<tr><td style="text-align:center">`d.get(key[, default])`</td><td>Returns the value associated with `key` in `d`. If `key` doesn't exist, returns `default`, or `None` if `default` is not specified.</td></tr>
<tr><td style="text-align:center">`d.keys()`</td><td>Returns a list of the keys of `d`</td></tr>
<tr><td style="text-align:center">`d.values()`</td><td>Returns a list of the values of `d`</td></tr>
<tr><td style="text-align:center">`d.items()`</td><td>Returns a list of the `(key, value)` pairs as tuples.</td></tr>
<tr><td style="text-align:center">`d.iteritems()`</td><td>Returns an iterator over the dictionary's `(key, value)` pairs.</td></tr>
<tr><td style="text-align:center">`d.update(y)`</td><td>Updates `d` with key:value pairs of `y` overwriting existing keys</td></tr>
<tr><td style="text-align:center">`d.pop(key[,default])`</td><td>If `key` is in `d` remove it and return its value. If the key does not exist, returns `default`, if provided, otherwise raises a `KeyError`.</td></tr>
</table>

See [https://docs.python.org/2/library/stdtypes.html#mapping-types-dict](https://docs.python.org/2/library/stdtypes.html#mapping-types-dict) for more details

In [62]:
## Create a dictionary by placing curly brackets {}
## around a comma delimited list of key:value pairs
d1 = {'A':'blue', 'B':'red'}
d1

{'A': 'blue', 'B': 'red'}

In [63]:
## Or call the dict() function on a list of tuples
d2 = dict([('A', 'blue'), ('B', 'red')])
d2 == d1

True

In [64]:
## Get all keys and values as a list of tuples
d1.items()

[('A', 'blue'), ('B', 'red')]

In [65]:
## Access the dictionary's values
d1['A']

'blue'

In [66]:
d1.get('A')

'blue'
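
The other methods in the table can be sketched as follows. The small codon dictionary is an invented example. `sorted()` is applied to the keys and items because dictionaries are unordered, and it also keeps the code working under both Python 2.7 and Python 3.

```python
codon_table = {'ATG': 'Met', 'TGG': 'Trp'}

missing = codon_table.get('AAA')             # None: key not present
fallback = codon_table.get('AAA', '?')       # '?': the supplied default

# update() adds new keys and overwrites existing ones
codon_table.update({'TAA': 'Stop', 'ATG': 'M'})
start = codon_table['ATG']                   # 'M'

removed = codon_table.pop('TGG')             # 'Trp', and the key is deleted
absent = codon_table.pop('CCC', 'absent')    # 'absent': default, no error

remaining_keys = sorted(codon_table.keys())  # ['ATG', 'TAA']
pairs = sorted(codon_table.items())          # [('ATG', 'M'), ('TAA', 'Stop')]
```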

## Python 2.7 vs. Python 3

We will be using Python 2.7 in this course. The reason we use Python 2.7 is mainly for the sake of compatibility. It is still the default version installed on many systems and some third-party packages are not yet available for Python 3. However, Python 3 is over five years old now. It's important to understand what's new, so you'll be ready for the future.

A good way to start getting your code ready for Python 3 while still using 2.7 is to make use of the built-in `__future__` module (the third-party [future compatibility package](http://python-future.org/index.html) offers further compatibility tools). For example, the following code uses `__future__` to import two features that make Python 3 different from Python 2.7: the print function (no longer a print statement), and the division operator (`/` no longer performs integer division on integers).

In [67]:
## Example of printing in Python 2.7 -- print is a statement, not a function
print "Hello World!"

Hello World!


In [68]:
## Example of printing in Python 3
from __future__ import print_function
print("Hello World!")

Hello World!


In [69]:
## What happens if I try the print statement now?
print "Hello World!"

SyntaxError: invalid syntax (<ipython-input-69-5fcbae8c4c7b>, line 2)

In [70]:
## Example of division in Python 2.7 -- integer division
7/2

3

In [71]:
## First convert one of the numbers to a floating point number
float(7)/2

3.5

In [72]:
## Example of division in Python 3
from __future__ import division
7/2

3.5

#### ** All code for assignments in this class must run under Python 2.7. However, it is up to you whether you want to use the `__future__` module, as in the examples above, to get used to Python 3 syntax.

#### You will see mostly 2.7 syntax in the class materials. 

## References

- <u>Programming Languages</u>, Ravi Sethi, 2nd Edition, Addison-Wesley (1996)
- <u>Problem Solving, Abstraction and Design Using C++</u>, Frank Friedman, Elliot Koffman, 4th Edition, Addison-Wesley (2004)
- <u>Python for Bioinformatics</u>, Sebastian Bassi, CRC Press (2010)

#### Last Updated: 26-Jan-2016