# Lecture Review 2/16/16

## Dictionaries

Dictionaries map keys to values: each key is unique and points to exactly one value, although several keys may share the same value. They are incredibly useful when your data has common IDs, like gene symbols or accession numbers.

* You can initialize a dictionary using the dictionary literal syntax

In [6]:
Dict = {'first_name':'Edward','last_name':'Jenner'}

In [7]:
print(Dict)

{'first_name': 'Edward', 'last_name': 'Jenner'}


* You can also create a dictionary dynamically

In [1]:
Dict = {}

In [2]:
Dict['first_name']='Edward'

In [10]:
Dict['last_name']='Jennar'

In [11]:
print(Dict)

{'first_name': 'Edward', 'last_name': 'Jennar'}


* You can access the values stored in the dictionary by using the keys

In [5]:
Dict['first_name']

'Edward'

* You can also update the values in a dictionary, if you want to add more information or have made a mistake

In [12]:
Dict['last_name']='Jenner'

In [13]:
print(Dict)

{'first_name': 'Edward', 'last_name': 'Jenner'}


In [14]:
Dict['birth_year']=1749

In [15]:
print(Dict)

{'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749}


* You can even create nested dictionaries

In [14]:
english_scientists = {'JennerE':{'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749},'SnowJ':{'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813}}

In [22]:
print(english_scientists)

{'JennerE': {'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749}, 'SnowJ': {'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813}}


In [23]:
print(english_scientists['SnowJ'])

{'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813}
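
* To reach a single value inside a nested dictionary, you can chain the keys. The cell below is a small sketch rather than part of the original session; the `'field'` key is a hypothetical addition used only for illustration.

```python
# Same nested dictionary as in the cells above.
english_scientists = {
    'JennerE': {'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749},
    'SnowJ': {'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813},
}

# The first key selects the inner dictionary, the second key selects a value in it.
snow_birth_year = english_scientists['SnowJ']['birth_year']
print(snow_birth_year)

# Inner dictionaries can be updated in place the same way.
# 'field' is a hypothetical key, not something used in the lecture.
english_scientists['SnowJ']['field'] = 'epidemiology'
print(sorted(english_scientists['SnowJ'].keys()))
```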


* You can't get the value of a nonexistent key in a dictionary

In [24]:
print(english_scientists['PasteurL'])

KeyError: 'PasteurL'

* Instead of risking a KeyError, you can check that a key exists before using it

In [25]:
'PasteurL' in english_scientists

False

In [27]:
'SnowJ' in english_scientists

True

* This can come in handy. Let's imagine we have a list of scientists, and we want to print all the information we have on them.  
We could write some code to look each one up in our dictionary and print the result.

In [33]:
scientists = ['SnowJ','JennerE','PasteurL']
for scientist in scientists:
    print(english_scientists[scientist])

{'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813}
{'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749}


KeyError: 'PasteurL'

* Unfortunately we get that ugly KeyError again. We probably should check to see if a key exists before accessing it.

In [32]:
scientists = ['SnowJ','JennerE','PasteurL']
for scientist in scientists:
    if scientist not in english_scientists:
        print("Uh-oh")
    else:
        print(english_scientists[scientist])

{'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813}
{'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749}
Uh-oh
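
* Another option, sketched here as an alternative to the `in` check, is the dictionary's `get` method. It returns `None` (or a default you supply) for a missing key instead of raising a KeyError. The `found` and `missing` lists are names made up for this example.

```python
english_scientists = {
    'JennerE': {'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749},
    'SnowJ': {'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813},
}
scientists = ['SnowJ', 'JennerE', 'PasteurL']

found = []    # last names of the scientists we have information on
missing = []  # IDs that are not in the dictionary

for scientist in scientists:
    # get() returns None for a missing key instead of raising KeyError.
    info = english_scientists.get(scientist)
    if info is None:
        missing.append(scientist)
    else:
        found.append(info['last_name'])

print(found)
print(missing)
```

Whether `in` or `get` reads better is mostly a matter of taste; `get` saves a second lookup when you need the value anyway.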


* We can get a list of all the keys in a dictionary

In [34]:
scientists = english_scientists.keys()

In [35]:
print(scientists)

['JennerE', 'SnowJ']


* We can also get all the values

In [36]:
scientist_info = english_scientists.values()

In [37]:
print(scientist_info)

[{'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749}, {'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813}]
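
* If you need the keys and the values together, `items()` gives you both as pairs. The sketch below sorts the results so the order is predictable. Note that in Python 3, `keys()`, `values()` and `items()` return view objects rather than lists, so wrap them in `list()` or `sorted()` if you need an actual list. The `summary` list is a made-up name for this example.

```python
english_scientists = {
    'JennerE': {'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749},
    'SnowJ': {'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813},
}

# sorted() returns a real list in both Python 2 and Python 3.
key_list = sorted(english_scientists.keys())
print(key_list)

# items() yields (key, value) pairs, which we unpack in the loop header.
summary = []
for scientist_id, info in sorted(english_scientists.items()):
    summary.append('{0}: born {1}'.format(scientist_id, info['birth_year']))

for line in summary:
    print(line)
```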


## Tuples

Tuples are another Python data type. They are similar to lists, but are immutable.

* Imagine we want to represent a point in 3D space. We could use a tuple for this, where the first element is x, the second is y, and the third is z.

In [10]:
point = (2.5,3.0,16)

* We can access the elements in the tuple

In [12]:
point[0]

2.5

* But we can't change them

In [13]:
point[0] = 5

TypeError: 'tuple' object does not support item assignment

In [14]:
point.append(4)

AttributeError: 'tuple' object has no attribute 'append'
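
* Two things tuples are handy for, shown here as a small sketch: unpacking the elements into separate names, and acting as dictionary keys (lists can't be keys, because they are mutable). The `labels` dictionary and the `'sample A'` label are hypothetical.

```python
point = (2.5, 3.0, 16)

# Unpack the three elements into separate variables.
x, y, z = point
print(x, y, z)

# Tuples are hashable, so they can be used as dictionary keys.
labels = {point: 'sample A'}  # hypothetical label for this point
print(labels[(2.5, 3.0, 16)])

# To "change" a tuple, build a new one from pieces of the old one.
moved = (5,) + point[1:]
print(moved)
```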

## Sets

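Sets are another data type. They hold an unordered collection of unique items. Unlike tuples, sets are mutable, but the items stored in them must be hashable (for example numbers, strings or tuples).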

In [17]:
s = set([1,2,3,4,1,2])

In [18]:
print(s)

set([1, 2, 3, 4])


In [19]:
ns = set([1,2,10,11,12])

* You can use mathematical set operations on sets, like intersection, union or difference

* Difference

In [20]:
ns-s

{10, 11, 12}

* Intersection

In [21]:
ns & s

{1, 2}

* Union

In [22]:
ns | s

{1, 2, 3, 4, 10, 11, 12}
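
* The operators `-`, `&` and `|` also have named method equivalents (`difference`, `intersection`, `union`), which some people find easier to read. The sketch below uses them on the same two sets, and also shows adding and removing items and testing membership.

```python
s = set([1, 2, 3, 4, 1, 2])
ns = set([1, 2, 10, 11, 12])

difference = ns.difference(s)      # same as ns - s
intersection = ns.intersection(s)  # same as ns & s
union = ns.union(s)                # same as ns | s
print(sorted(difference), sorted(intersection), sorted(union))

# Sets can be modified after they are created.
s.add(5)      # adding an item that is already present would do nothing
s.discard(1)  # discard() does not complain if the item is missing
print(sorted(s))

# Membership tests work just like they do for dictionary keys.
has_ten = 10 in ns
print(has_ten)
```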

## True and False

In Python, many values besides the literals `True` and `False` can be treated as true or false, for example in an `if` statement. Empty things and zero values are treated as False, while non-empty things and non-zero values are treated as True.

* Empty and non-empty lists

In [25]:
bool([])

False

In [28]:
bool([1])

True

* Empty and non-empty Dictionaries

In [29]:
bool({})

False

In [31]:
bool({'one':1})

True

* Empty and non-empty Strings

In [38]:
bool("")

False

In [39]:
bool("something")

True

* Zero and non-zero values

In [33]:
bool(0)

False

In [34]:
bool(0.0)

False

In [35]:
bool(0.1)

True

In [36]:
bool(1)

True

* The None type is also "Falsey"

In [40]:
bool(None)

False
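
* These rules are what an `if` statement uses when you give it a value instead of a comparison. The sketch below uses a made-up helper, `describe`, to show this, and then a common pitfall: a measurement of `0` is "Falsey" even though it is real data, so use `is None` when you specifically mean "missing". The `expression_level` variable is hypothetical.

```python
def describe(value):
    """Return 'truthy' or 'falsy' using the same test an if statement applies."""
    if value:
        return 'truthy'
    return 'falsy'

values = [[], [1], {}, '', 'something', 0, 0.1, None]
results = [describe(v) for v in values]
print(results)

# Pitfall: 0 is a valid measurement, but it is falsy.
expression_level = 0  # hypothetical measurement
is_missing = expression_level is None
print(is_missing)
```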

## Python 2 vs Python 3

Python 2 and Python 3 are different from each other, and not necessarily 100% compatible.  
It is possible to write code that works with both Python 2 and Python 3.  
One of the main differences is the print statement/function.

* In Python 2 printing is a statement

In [2]:
print "This is Python 2 syntax"

This is Python 2 syntax


* In Python 3 printing is a function

In [3]:
print("This is Python 3 syntax")

This is Python 3 syntax


* Some features of the Python 3 print function won't work with the Python 2 print statement, like the `sep` argument

In [4]:
print("This","is","Python","3","syntax",sep=" ")

SyntaxError: invalid syntax (<ipython-input-4-0a3dc5b66af5>, line 1)

* However, we can import the Python 3 print function into Python 2 by adding the line `from __future__ import print_function` to our code

In [3]:
from __future__ import print_function

In [4]:
print("This","is","Python","3","syntax",sep=" ")

This is Python 3 syntax


## Files

Working with files is important in bioinformatics, because it is such a common task.  
In Python it is pretty easy to work with files.

* First we will open a file, and write a string to it

In [5]:
file = open('simplefile','w')
file.write('This is a simple text file.\n')
file.write('It only has two lines.')
file.close()

In [7]:
%ls

LectureReview2-16-16.ipynb  README.md                   simplefile


* As you can see, a new file called `simplefile` has been created

* Now we will read in `simplefile`

In [9]:
file = open('simplefile','r')
print(file.readline())
print(file.readline())
file.close()

This is a simple text file.

It only has two lines.


* There is a better way to loop through the lines of a file.  
Instead of calling `readline` every time, we can use the `for ... in ...` syntax

In [11]:
file = open('simplefile','r')
for line in file:
    print(line)
file.close()

This is a simple text file.

It only has two lines.


* There is also a better way to open files.  
If you use the `with open ... as ...` syntax, the file will automatically be closed.

In [12]:
with open('simplefile', 'r') as file:
    for line in file:
        print(line)

This is a simple text file.

It only has two lines.
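
* The blank lines in the output above appear because each line read from the file still ends with `\n`, and `print` adds another newline. A common fix, sketched below, is to strip the newline first. To keep the example self-contained it writes its own copy of the file into a temporary directory; `path`, `handle` and `lines` are names chosen for this sketch.

```python
import os
import tempfile

# Write a fresh copy of the two-line file into a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'simplefile')
with open(path, 'w') as handle:
    handle.write('This is a simple text file.\n')
    handle.write('It only has two lines.')

# Read it back, removing the trailing newline from each line.
lines = []
with open(path, 'r') as handle:
    for line in handle:
        clean_line = line.rstrip('\n')
        lines.append(clean_line)
        print(clean_line)
```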


## pickle

The pickle module in Python is used to store native Python objects in files.  
This can be useful if you don't want to convert the object (say a dictionary) to a string before saving it.

* We will use pickle to store our english_scientists dictionary in a new file

In [15]:
import pickle

In [16]:
with open('scientist_file','wb') as file:  # binary mode works in Python 2 and is required in Python 3
    pickle.dump(english_scientists,file)

* If we take a look at the new file, we will see that it has all the information, but is stored in an interesting format

In [19]:
%cat scientist_file

(dp0
S'JennerE'
p1
(dp2
S'first_name'
p3
S'Edward'
p4
sS'last_name'
p5
S'Jenner'
p6
sS'birth_year'
p7
I1749
ssS'SnowJ'
p8
(dp9
g3
S'John'
p10
sg5
S'Snow'
p11
sg7
I1813
ss.

* We can load a pickled object back into Python

In [20]:
with open('scientist_file','rb') as file:  # binary mode works in Python 2 and is required in Python 3
    pickled_scientists = pickle.load(file)
    print(pickled_scientists)

{'JennerE': {'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749}, 'SnowJ': {'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813}}
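
* Here is a self-contained sketch of the full round trip. It opens the file in binary mode (`'wb'` and `'rb'`), which Python 3 requires for pickle and which also works in Python 2, and it writes to a temporary directory so it doesn't touch your working folder. The names `path`, `handle` and `restored` are chosen for this sketch. Only load pickle files you trust, since loading a pickle can run arbitrary code.

```python
import os
import pickle
import tempfile

english_scientists = {
    'JennerE': {'first_name': 'Edward', 'last_name': 'Jenner', 'birth_year': 1749},
    'SnowJ': {'first_name': 'John', 'last_name': 'Snow', 'birth_year': 1813},
}

path = os.path.join(tempfile.mkdtemp(), 'scientist_file')

# Save the dictionary; pickle writes bytes, so the file is opened with 'wb'.
with open(path, 'wb') as handle:
    pickle.dump(english_scientists, handle)

# Load it back; the file is opened with 'rb' to read bytes.
with open(path, 'rb') as handle:
    restored = pickle.load(handle)

# The loaded object is an equal copy, not the same object.
print(restored == english_scientists)
print(restored is english_scientists)
```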
