# 8. Dictionaries
*Set relation between x and y*

## 8.1 Introduction

So far we've seen variables that store one value or a series of values (see [section 5](5_Lists_Tuples_Sets.ipynb): lists, tuples and sets). There is another way of storing information, where you associate one value with another; in Python this is called a dictionary. Dictionaries provide a quick way of looking up one value by means of another.



## 8.2 Dictionary creation & usage

It is best to think of a dictionary as a set of *key:value* pairs, with the requirement that the keys are unique (within one dictionary). A dictionary is created with curly brackets {}; placing a comma-separated list of *key:value* pairs between the brackets adds those pairs as its initial contents.

You can then look up or add values by placing the key between square brackets [ ]. You can also look up a value with the `get()` method.


In [None]:
myDictionary = {'A': 'Ala', 'C': 'Cys', 'D': 'Asp'}
oneLetterCode = 'A'
 
print(oneLetterCode, myDictionary[oneLetterCode])
 
myDictionary['E'] = 'Glu'
print(myDictionary)
print(myDictionary.get('C'))
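
Besides the curly-bracket notation, a dictionary can also be built from data you already have. The sketch below assumes two hypothetical lists of equal length, `oneLetterCodes` and `threeLetterCodes` (these names are for illustration only), and also shows that an empty dictionary can be filled one key at a time:

```python
# Hypothetical example data: two parallel lists
oneLetterCodes = ['A', 'C', 'D']
threeLetterCodes = ['Ala', 'Cys', 'Asp']

# zip() pairs the items up; dict() turns each pair into a key:value entry
codeDictionary = dict(zip(oneLetterCodes, threeLetterCodes))
print(codeDictionary)

# An empty dictionary can be filled step by step
emptyDictionary = {}
emptyDictionary['E'] = 'Glu'
print(len(emptyDictionary))
```
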

The lookup value is called a dictionary *key*, and it refers to a *value*. Dictionaries, like lists, have several useful built-in methods. The most frequently used are listed below:
- `keys()` to list the dictionary's keys
- `values()` to list the values in the dictionary
- `get()` to return the value of a specified key
- `pop()` to remove the specified key and its value

In [None]:
myDictionary = {'A': 'Ala', 'C': 'Cys', 'D': 'Asp', 'E': 'Glu'}
 
print(list(myDictionary.keys()))
print(list(myDictionary.values()))

print(myDictionary.get('C'))

myDictionary.pop('E')
print(myDictionary)
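
Two details of these methods may be useful in practice. The sketch below uses `items()`, a further built-in dictionary method not listed above, to loop over keys and values together, and it shows that `pop()` hands back the value it removes. The variable names `removedValue` and `missingValue` are only illustrative:

```python
myDictionary = {'A': 'Ala', 'C': 'Cys', 'D': 'Asp', 'E': 'Glu'}

# items() yields (key, value) tuples, so both can be unpacked in the loop
for oneLetterCode, threeLetterCode in myDictionary.items():
    print(oneLetterCode, threeLetterCode)

# pop() returns the value that it removes
removedValue = myDictionary.pop('E')
print(removedValue)

# With a second argument, pop() returns that default instead of
# raising an error when the key is missing
missingValue = myDictionary.pop('B', None)
print(missingValue)
```
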

If you try to access a key that doesn't exist using square brackets, Python raises a `KeyError`:

In [None]:
myDictionary = {'A': 'Ala', 'C': 'Cys', 'D': 'Asp', 'E': 'Glu'}
 
print(myDictionary['B'])

You should therefore check whether a key exists before accessing it:


In [None]:
# Newlines don't matter when initialising a dictionary...
myDictionary = {
     'A': 'Ala',
     'C': 'Cys',
     'D': 'Asp',
     'E': 'Glu',
     'F': 'Phe',
     'G': 'Gly',
     'H': 'His',
     'I': 'Ile',
     'K': 'Lys',
     'L': 'Leu',
     'M': 'Met',
     'N': 'Asn',
     'P': 'Pro',
     'Q': 'Gln',
     'R': 'Arg',
     'S': 'Ser',
     'T': 'Thr',
     'V': 'Val',
     'W': 'Trp',
     'Y': 'Tyr'}

if 'B' in myDictionary:
    print(myDictionary['B'])
else:
    print("myDictionary doesn't have key 'B'!")
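
An explicit check is not the only way to deal with missing keys. As a minimal sketch, the `get()` method accepts a default as its second argument, and the `KeyError` from square-bracket access can be caught with `try`/`except`. The fallback string `'Unknown'` is an arbitrary choice for this example:

```python
myDictionary = {'A': 'Ala', 'C': 'Cys', 'D': 'Asp'}

# get() returns None for a missing key, or the default given as second argument
print(myDictionary.get('B'))
print(myDictionary.get('B', 'Unknown'))

# Alternatively, catch the KeyError raised by square-bracket access
try:
    threeLetterCode = myDictionary['B']
except KeyError:
    threeLetterCode = 'Unknown'
print(threeLetterCode)
```
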

---
### 8.2.1 Exercise

Use a dictionary to track how many times each amino acid code appears in the following sequence:
```
SFTMHGTPVVNQVKVLTESNRISHHKILAIVGTAESNSEHPLGTAITKYCKQELDTETLGTCIDFQVVPGCGISCKVTNIEGLLHKNNWNIED  
NNIKNASLVQIDASNEQSSTSSSMIIDAQISNALNAQQYKVLIGNREWMIRNGLVINNDVNDFMTEHERKGRTAVLVAVDDELCGLIAIADT
```
Tip: use the one-letter code as key in the dictionary, and the count as value.

---

In [None]:
# Use a dictionary to track how many times each amino acid code appears in the following sequence:
# SFTMHGTPVVNQVKVLTESNRISHHKILAIVGTAESNSEHPLGTAITKYCKQELDTETLGTCIDFQVVPGCGISCKVTNIEGLLHKNNWNIEDNNIKNASLVQIDASNEQSSTSSSMIIDAQISNALNAQQYKVLIGNREWMIRNGLVINNDVNDFMTEHERKGRTAVLVAVDDELCGLIAIADT
# Tip: use the one-letter code as key in the dictionary, and the count as value. 
mySequence = "SFTMHGTPVVNQVKVLTESNRISHHKILAIVGTAESNSEHPLGTAITKYCKQELDTETLGTCIDFQVVPGCGISCKVTNIEGLLHKNNWNIEDNNIKNASLVQIDASNEQSSTSSSMIIDAQISNALNAQQYKVLIGNREWMIRNGLVINNDVNDFMTEHERKGRTAVLVAVDDELCGLIAIADT"
 
# First way to do this, using sets (condensed)
aminoAcidCount = {}
myUniqueAminoAcids = set(mySequence)
for aaCode in myUniqueAminoAcids:
    # Count once, then both print and store the result
    aaCount = mySequence.count(aaCode)
    print("Amino acid {} occurs {} times.".format(aaCode, aaCount))
    aminoAcidCount[aaCode] = aaCount

In [None]:
# Another way to do this, a little bit more elaborate and using the myDictionary as a reference for iteration

myDictionary = {
     'A': 'Ala',
     'C': 'Cys',
     'D': 'Asp',
     'E': 'Glu',
     'F': 'Phe',
     'G': 'Gly',
     'H': 'His',
     'I': 'Ile',
     'K': 'Lys',
     'L': 'Leu',
     'M': 'Met',
     'N': 'Asn',
     'P': 'Pro',
     'Q': 'Gln',
     'R': 'Arg',
     'S': 'Ser',
     'T': 'Thr',
     'V': 'Val',
     'W': 'Trp',
     'Y': 'Tyr'}

# Iterating over a dictionary yields its keys one by one
for aaCode in myDictionary:
    aaCount = mySequence.count(aaCode)
    print("Amino acid {} occurs {} times.".format(aaCode, aaCount))
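
A third possibility, sketched here on a hypothetical short sequence (`shortSequence` is not part of the exercise), is to walk through the sequence once and increase the count for each code as it is encountered. `Counter` from the standard library's `collections` module does the same bookkeeping:

```python
from collections import Counter

# Hypothetical short sequence, used only for illustration
shortSequence = "SFTMHGTPVVNQ"

# Walk through the sequence once; get() supplies 0 for codes not seen yet
aminoAcidCount = {}
for aaCode in shortSequence:
    aminoAcidCount[aaCode] = aminoAcidCount.get(aaCode, 0) + 1
print(aminoAcidCount)

# Counter builds the same code:count mapping in one call
counterResult = Counter(shortSequence)
print(counterResult['T'], counterResult['V'])
```

This reads the sequence only once, whereas calling `count()` for every amino acid code scans the whole sequence each time.
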

## 8.3 More with dictionaries
You can also make a dictionary refer to lists, or other dictionaries:

In [None]:
# Create a dictionary with a list of names and a number-based dictionary
# with an identification number referring to information about a person 
mainDict = {}
mainDict['myNames'] = ['Jack','Joe','Anne','Julia','Dennis','Yuri','Mel']
# Dictionaries can be nested: the value stored under 'myIds' is itself a dictionary
mainDict['myIds'] = {5343:  ('Male',  'Jack', 34),
                     3432:  ('Female','Anne', 25),
                     7345:  ('Male',  'Yuri', 53)}
 
# Loop over the names in the 'myNames' list stored in the dictionary
for name in mainDict['myNames']:

    # Check whether we find this name in the myIds dictionary
    nameFound = False

    # Loop over all the information for the id numbers,
    # and check whether the name matches
    for idNumber in mainDict['myIds'].keys():
        # Get the information out - take care here to not use 
        # the variable 'name' again or we will overwrite the original value!
        (gender,nameInDict,age) = mainDict['myIds'][idNumber]
 
        # Check whether the names match
        if name == nameInDict:
            print("{} has ID number {}".format(name, idNumber))
            nameFound = True
            break
 
    # If no match was found, print out that this person has no ID number
    if not nameFound:
        print("No ID number found for {}".format(name))
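
The inner loop searches all ID records again for every name. One possible alternative, sketched below with the same records and a shortened name list, is to build a second dictionary that maps each name to its ID number, so that every name needs only a single lookup. The name `idByName` is hypothetical, and the sketch assumes that each name occurs only once among the records:

```python
# Same records as in the cell above, with a shortened list of names
myIds = {5343: ('Male', 'Jack', 34),
         3432: ('Female', 'Anne', 25),
         7345: ('Male', 'Yuri', 53)}
myNames = ['Jack', 'Joe', 'Anne']

# Build a name -> ID dictionary once (assumes every name occurs only once)
idByName = {}
for idNumber, (gender, nameInDict, age) in myIds.items():
    idByName[nameInDict] = idNumber

# Each name now needs a single dictionary lookup
for name in myNames:
    if name in idByName:
        print("{} has ID number {}".format(name, idByName[name]))
    else:
        print("No ID number found for {}".format(name))
```
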

However, only values that cannot change (immutable values) can be used as keys, so tuples are OK but lists are not. Keys also have to be unique: if you assign to a key that already exists, the old value is overwritten:

In [None]:
mySample = {'pH': 5.6, 'temperature': 288.0, 'volume': 200, 'name': 'calibration_1'}
  
print(mySample['pH'])

mySample['pH'] = 7.0
print(mySample['pH'])

# This is fine
moleculeKey = ('protein','mySmallPeptide') # tuple
mySample[moleculeKey] = 'ASKLPIITREWSDDN'
print(mySample)

# This will fail with a TypeError, because a list can change and is therefore not hashable
otherMoleculeKey = ['DNA','myDna'] # list
mySample[otherMoleculeKey] = 'TGCATTGCCA'
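
If the key data happens to be in a list, converting it to a tuple first gives a usable key. The sketch below catches the `TypeError` so that the cell runs to the end; the variable names `tupleKey` and `listKey` are only illustrative:

```python
mySample = {}

# A tuple is immutable, so it works as a key
tupleKey = ('protein', 'mySmallPeptide')
mySample[tupleKey] = 'ASKLPIITREWSDDN'

# A list is mutable, so using it as a key raises a TypeError
listKey = ['DNA', 'myDna']
try:
    mySample[listKey] = 'TGCATTGCCA'
except TypeError as error:
    print("Cannot use a list as key:", error)

# Converting the list to a tuple gives a usable key
mySample[tuple(listKey)] = 'TGCATTGCCA'
print(mySample)
```
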

## 8.4 Next session

Go to our [next chapter](9_Files.ipynb).