# Class 7 - Modules & Dictionaries

Course objectives:
- Practice how to use loops with strings
- Know how to import and use Python modules
- Know how to use built-in methods associated with the string type
- Know how to recognize and manipulate dictionaries
- Know how to use the native methods of dictionaries
- Find out how to explore the Python documentation
- Know how to combine the knowledge of previous classes to write simple scripts

# Python Modules

We want to code a die in Python that gives us a random integer between 1 and 6. After a Google search, we find that the `random` module has a function, `randrange`, that does exactly what we need: https://docs.python.org/3/library/random.html

We read the documentation and try to call the function directly, as we did with built-in functions in the previous lessons.

In [None]:
randrange(1, 7)

NameError: name 'randrange' is not defined


But we get an error. At the top of the documentation page, we can read that `random` is a Python module.


**But what is a Python module?**

A module is a file of Python code that groups related definitions (functions, constants, classes) under one name. To use the `random` module, we first have to import it into our notebook; only then can we call its functions. Let's give it a try:


In [None]:
import random as rd

In [None]:
rd.randrange(1,7)

1

In [None]:
for i in range(10):
  print(rd.randrange(1,7))  # use the alias rd chosen in the import above

The previous cell does what we need. Each call returns a random integer between 1 and 6 (the upper bound, 7, is excluded).

Now let's see some more details about importing modules.

In [None]:
rd  # the alias refers to the module object itself

<module 'random' from '/usr/local/lib/python3.7/random.py'>

In [None]:
from random import randrange

In [None]:
randrange(1,6)

2

In [None]:
uniform(1.0, 2.0)  # fails: only randrange was imported by name

NameError: name 'uniform' is not defined

In [None]:
from random import *

In [None]:
uniform(1.0, 2.0)

1.0839080288777287
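
The cells above use three different import styles. The sketch below puts them side by side; the seed value 42 is an arbitrary choice made only so that repeated runs give the same rolls. Note that `from random import *` also works, as shown above, but it brings every public name of the module into the notebook, which can silently overwrite names you already defined.

```python
import random                 # 1) import the module, then call random.<name>
import random as rd           # 2) the same module under a shorter alias
from random import randrange  # 3) import a single name directly

random.seed(42)  # arbitrary seed, only to make the run repeatable

rolls = [
    random.randrange(1, 7),  # via the module name
    rd.randrange(1, 7),      # via the alias
    randrange(1, 7),         # via the directly imported name
]
print(rolls)
print(rd is random)  # True: the alias points to the same module object
```

Whichever style you choose, all three calls reach the same function, so each roll is an integer between 1 and 6.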

## The built-in methods associated with the string type

From now on, I strongly encourage you to consult the Python documentation, or to do your own research, to find the information you need. The string methods are documented here: https://docs.python.org/3/library/stdtypes.html#str
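
As a warm-up before the exercises, here is a minimal sketch of how string methods are called. It uses a toy string (`"banana"`, chosen only for illustration) rather than the DNA strand, so the exercises below remain to be solved.

```python
word = "banana"  # toy string, unrelated to the exercises

print(word.count("an"))          # 2: non-overlapping occurrences of "an"
print(word.find("nan"))          # 2: index where the first match starts
print(word.find("xyz"))          # -1: returned when there is no match
print(word.replace("na", "NA"))  # baNANA: a new string is returned
print(word)                      # banana: the original string is unchanged
```

Strings are immutable, so methods such as `replace` never modify the original; they return a new string that you have to store in a variable if you want to keep it.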

### count()

I invite you to read the documentation of this function and then try to use it on the following DNA strand to find the number of "G" bases in the strand.

In [None]:
DNA = "ACGTGCAGAGACATAGATAGATACAGATATATAGAGACATACGTGTCGTACGTTGTACGATGACAGATGA"

In [None]:
#Your code

Try to get the same result as in the previous cell without using the built-in method.

In [None]:
#Your code

### find()

I invite you to read the documentation of this function and then try to use it on our DNA strand to find out if the "ATAG" pattern exists on our strand.

In [None]:
#Your code

Try to get the same result as in the previous cell without using the built-in method.

In [None]:
#Your code
# Hint: slide a window of 4 indices along the strand
# 0123
#  1234
#   2345
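
The index pattern in the cell above (0123, then 1234, then 2345) describes a sliding window: a slice of fixed width that moves one position at a time. The sketch below only shows how to produce the windows, on a toy string that is not the exercise strand; comparing each window with a pattern is left to you.

```python
text = "ABCDEF"  # toy string, not the exercise strand
width = 4        # window size, matching the 4-index pattern above

windows = []
# The last valid start index is len(text) - width, hence the + 1 in range()
for start in range(len(text) - width + 1):
    windows.append(text[start:start + width])

print(windows)  # ['ABCD', 'BCDE', 'CDEF']
```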

### replace()

I invite you to read the documentation of this function and then try to use it on our DNA strand to replace the "ATAG" pattern with "AAAA".

In [None]:
#Your code

Try to get the same result as in the previous cell without using the built-in method.

In [None]:
#Your code

## Exercises

### Exercise 1

Write a script that transcribes the following DNA strand into its complementary RNA.
To transcribe the DNA, convert each DNA base into an RNA base using the following rules:
* base A turns into base U
* base T turns into base A
* base C turns into base G
* base G turns into base C

An RNA string is expected such as: `"UGCAC..."`


In [None]:
DNA = "ACGTGCAGAGACATAGATAGATACAGATATATAGAGACATACGTGTCGTACGTTGTACGATGACAGATG"

### Exercise 2

In biology, a codon is a group of 3 consecutive bases. Cutting the previous RNA strand into portions of 3 bases gives its codons.
Write a script that performs this split and stores all the codons in a list.
Sample output: `["UGC", "ACG", "UCU", ...]`

['UGC', 'ACG', 'UCU', 'CUG', 'UAU', 'CUA', 'UCU', 'AUG', 'UCU', 'AUA', 'UAU', 'CUC', 'UGU', 'AUG', 'CAC', 'AGC', 'AUG', 'CAA', 'CAU', 'GCU', 'ACU', 'GUC', 'UAC']


### Exercise 3

Write a script that gives a random codon from your list of codons from the previous exercise.

UGC


### Exercise 4

Write a script that computes the Hamming distance between 2 strands of DNA. The Hamming distance is the number of positions at which two strands of the same length differ. Here, each such position counts as one mutation: at the same index, the two bases are different.

In [None]:
brin1 = "TCGTGCAGAGACATAGATACATACAGATATATAGAGTGATACGTGTCGTACGTAGTACGATGACAGATGA"
brin2 = "ACGTGGAGAGACATAGATAGATACAGATAAATAGAGACATACGTGTCCTACGTTGTACGATGAGTGTTGA"

11
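
To make the definition of the Hamming distance concrete, here is a small worked example on two toy strands (`strand_a` and `strand_b` are illustrative names and values, not the exercise data). It is one possible way to list the differing positions, not the only one.

```python
strand_a = "GATTACA"  # toy strands of equal length, not the exercise data
strand_b = "GACTATA"

# Collect every index at which the two bases differ
mismatches = [i for i in range(len(strand_a)) if strand_a[i] != strand_b[i]]

print(mismatches)       # [2, 5]
print(len(mismatches))  # 2
```

The two strands differ at index 2 (T versus C) and at index 5 (C versus T), so their Hamming distance is 2.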


### Exercise 5

In the following list of DNA strands, display for each strand the ratio of C bases to G bases (the number of C bases divided by the number of G bases). Which strand has the highest ratio?

In [None]:
liste_brins = ["TCGTGCAGAGACATAGATACATACAGATATATAGAGTGATACGTGTCGTACGTAGTACGATGACAGATGA", "CGATGTTGATACATACAGACTAGATCAGATTTACAGATAGACGTGATATAGACTAGATC", "GTGATAGATACATATATATATAGACAGCAGCAGTGCGCTGAC", "TGATGTACAGCGCGCGATGTGTGAGCGCATATAACAGACAGATGACATATATACAGATAATATTTAGATGCAGATTAGTTGACAGTTTGACAGTAGATAGGGTA", "GGACGGCAGCGATGTGCCGCCCACACCCGGGATGT"]

[0.5555555555555556, 0.75, 0.7, 0.48148148148148145, 0.9230769230769231]
4
GGACGGCAGCGATGTGCCGCCCACACCCGGGATGT


### Exercise 6

Write a script that asks the user for a word or phrase and then displays it in the SpongeBob meme format: letters separated by spaces, alternating between lowercase and uppercase.
For example, "Hello" turns into "h E l L o". (Hint: `input`)

In [None]:
#Your code

### Exercise 7

Write a script that finds all prime numbers smaller than 50.

In [None]:
#Your code

### Exercise 8 

Given an unsorted list of names, write a script that prints the names alphabetically. For each name also print its position in the initial (unsorted) list and the length of the name.

In [None]:
names =['Mark', 'Amber', 'Todd', 'Anita', 'Sandy', 'Jon', 'Bill', 'Maria', 'Jenny', 'Jack']

In [None]:
#Your code

### Exercise 9

Write a script that draws a tree of height n.

In [None]:
#Example tree size n = 3:
#
#  * 
# ***
#*****
#  *

In [None]:
#Example tree size n = 7:
#      *
#     ***
#    *****
#   *******
#  *********
# ***********
#*************
#      *

In [None]:
#Example tree size n = 11:
#          *
#         ***
#        *****
#       *******
#      *********
#     ***********
#    *************
#   ***************
#  *****************
# *******************
#*********************
#          *

In [None]:
#Your code

### Exercise 10

Here is the first page of Harry Potter and the Philosopher's Stone. We will try to find the most frequent words on this first page. To do this, we must first clean up the text.

In [None]:
harry_potter_page = """Mr. and Mrs. Dursley, of number four, Privet Drive, 
were proud to say that they were perfectly normal, 
thank you very much. They were the last people you’d 
expect to be involved in anything strange or 
mysterious, because they just didn’t hold with such 
nonsense. 

Mr. Dursley was the director of a firm called 
Grunnings, which made drills. He was a big, beefy 
man with hardly any neck, although he did have a 
very large mustache. Mrs. Dursley was thin and 
blonde and had nearly twice the usual amount of 
neck, which came in very useful as she spent so 
much of her time craning over garden fences, spying 
on the neighbors. The Dursley s had a small son 
called Dudley and in their opinion there was no finer 
boy anywhere. 

The Dursleys had everything they wanted, but they 
also had a secret, and their greatest fear was that 
somebody would discover it. They didn’t think they 
could bear it if anyone found out about the Potters. 
Mrs. Potter was Mrs. Dursley’s sister, but they hadn’t 
met for several years; in fact, Mrs. Dursley pretended 
she didn’t have a sister, because her sister and her 
good-for-nothing husband were as unDursleyish as it 
was possible to be. The Dursleys shuddered to think 
what the neighbors would say if the Potters arrived in 
the street. The Dursleys knew that the Potters had a 
small son, too, but they had never even seen him. 

This boy was another good reason for keeping the 
Potters away; they didn’t want Dudley mixing with a 
child like that. """

First, we need to convert all the text to lowercase.

In [None]:
#Your code

Then we need to remove punctuation and special characters.

In [None]:
#Your code

Then we will split the text into individual words and store them in a list (hint: `split`).

In [None]:
#Your code

Finally, we can count how often each word appears (hint: the `Counter` class from the `collections` module).

In [None]:
#Your code
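
For the counting step, `collections.Counter` from the standard library is one convenient tool. The sketch below shows its basic behavior on a toy word list (the list is made up for illustration and is not the cleaned Harry Potter text).

```python
from collections import Counter

words = ["the", "cat", "and", "the", "hat", "and", "the", "bat"]  # toy list

frequencies = Counter(words)       # maps each word to its number of occurrences
print(frequencies["the"])          # 3
print(frequencies["dog"])          # 0: a missing word counts as zero
print(frequencies.most_common(2))  # [('the', 3), ('and', 2)]
```

A `Counter` behaves like a dictionary whose values are counts, which makes it a natural bridge to the next part of this class.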



---



# Dictionaries

In [None]:
d = {} # an empty dictionary

In [None]:
type(d)

dict

In [None]:
my_dict = {'1': 20, '2': 3, '3': 16} # a dictionary associates a value with each key, written as key: value pairs

In [None]:
my_dict['3'] # we access values through their keys

16

In [None]:
my_dict['12'] = 15 # adding a new key-value pair
my_dict

{'1': 20, '12': 15, '2': 3, '3': 16}

In [None]:
my_dict['2'] = 1 # modifying a value
my_dict

{'1': 20, '2': 1, '3': 16, '12': 15}

In [None]:
# each key must be unique but some values can be identical
{
 'hello': 'bonjour', 
 'bye': 'au revoir', 
 'thank you': 'merci',
 'cat': 'chat',
 'nine': 'neuf',
 'new': 'neuf'
 } 

{'bye': 'au revoir',
 'cat': 'chat',
 'hello': 'bonjour',
 'new': 'neuf',
 'nine': 'neuf',
 'thank you': 'merci'}

In [None]:
# the values can take several types (int, float, str, list...)
fr_to_en = {
 'bonjour': 'hello', 
 'au revoir': 'bye', 
 'merci': 'thank you',
 'chat': 'cat',
 'neuf': ['nine', 'new'],
 } 

In [None]:
len(fr_to_en)

5

In [None]:
fr_to_en.items() # access key-value pairs

dict_items([('bonjour', 'hello'), ('au revoir', 'bye'), ('merci', 'thank you'), ('chat', 'cat'), ('neuf', ['nine', 'new'])])

In [None]:
fr_to_en.keys() # access the keys

dict_keys(['bonjour', 'au revoir', 'merci', 'chat', 'neuf'])

In [None]:
fr_to_en.values() # access the values

dict_values(['hello', 'bye', 'thank you', 'cat', ['nine', 'new']])
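
The three methods above are most often used in loops. The following sketch, on a shortened copy of the dictionary, shows how `items()` pairs can be unpacked in a `for` loop and how the returned views relate to the dictionary; the variable names `french_words` and `all_values` are chosen here for illustration.

```python
fr_to_en = {
    'bonjour': 'hello',
    'au revoir': 'bye',
    'merci': 'thank you',
}

# items() yields (key, value) pairs, which a for loop can unpack directly
for french, english in fr_to_en.items():
    print(french, '->', english)

# keys() and values() return views; list() turns them into ordinary lists
french_words = list(fr_to_en.keys())
print(french_words)  # ['bonjour', 'au revoir', 'merci']

# Views are live: they reflect later changes to the dictionary
all_values = fr_to_en.values()
fr_to_en['chat'] = 'cat'
print(len(all_values))  # 4
```

If you need a snapshot that does not change with the dictionary, convert the view to a list first, as done for `french_words`.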

## Questions

In [None]:
d = {1:"a", 2:"b", 3:"c"}

In [None]:
d[2]

In [None]:
d[0]

In [None]:
d[2] = "e"

In [None]:
d["b"]

In [None]:
d.append("f")

## Exercises

#### Exercise 1
You are the HR manager of a company that has just bought a competitor. You have 2 employee files (name: [age, department, date of arrival]) that you have to merge into a single dictionary.

In [None]:
employees1 = {'Michel Iredant': [42, 'Sales', '10-01-2012'], 'Pat Miregal': [39, 'Consulting', '01-12-2017'], 'Jane Saitaki': [27, 'Direction', '03-06-2020'], 'Ed Lemans': [58, 'Communication', '29-03-2021']}
employees2 = {'Mav Davis': [21, 'Reception', '07-02-2020'], 'Rone Travis': [63, 'Management', '01-10-2012'], 'Yosh Loera': [35, 'Sales', '19-11-2015']}

# here

#### Exercise 2
Write a script to check if a key already exists in a dictionary.

In [None]:
# here

#### Exercise 3
Using the following dictionary, output a list of the codons translated into amino acids. 

In [None]:
arn = ['CCU', 'ACG', 'GCU', 'CUU', 'GUA', 'UGG', 'CUU', 'GCU', 'UGA']
arn_to_aa = {'CCU':"Proline", 'ACG':"Threonine", 'GCU':"Alanine", 'CUU':"Leucine", 'GUA':"Valine", 'UGG':"Tryptophan", 'UGA': "STOP"}  # codon -> amino acid, following the standard genetic code

#### Exercise 4
So far, we have studied two types of collections in Python: lists and dictionaries. Other collection types also exist, such as sets and tuples.
To better understand these 4 types of collections, use the internet to answer the following questions for lists, dictionaries, sets and tuples.

* What does the collection represent?
* Is the collection type mutable?
* Is the collection type ordered?
* How can I create an empty collection?
* How can I add an element?
* How can I access an element?
* How can I modify an element?
* How can I delete an element?
* What are the other important methods of this collection?
* Are there specific use cases where this collection is useful?

#### Exercise 5
Display the keys of the dictionary obtained in exercise 1, sorted by the employees' age.

In [None]:
# here

#### Exercise 6
From the dictionary obtained in exercise 1, calculate the average age of the employees.

In [None]:
#here

#### Exercise 7
From the following text, generate a dictionary containing the frequency of each word.

In [None]:
cyrano = "En variant le ton, — par exemple, tenez : \
Agressif : « Moi, monsieur, si j’avais un tel nez, \
Il faudrait sur-le-champ que je me l’amputasse ! » \
Amical : « Mais il doit tremper dans votre tasse ! \
Pour boire, faites-vous fabriquer un hanap ! » \
Descriptif : « C’est un roc !… c’est un pic !… c’est un cap ! \
Que dis-je, c’est un cap ?… C’est une péninsule ! » \
Curieux : « De quoi sert cette oblongue capsule ? \
D’écritoire, monsieur, ou de boîte à ciseaux ? » \
Gracieux : « Aimez-vous à ce point les oiseaux \
Que paternellement vous vous préoccupâtes \
De tendre ce perchoir à leurs petites pattes ? » \
Truculent : « Çà, monsieur, lorsque vous pétunez, \
La vapeur du tabac vous sort-elle du nez \
Sans qu’un voisin ne crie au feu de cheminée ? » \
Prévenant : « Gardez-vous, votre tête entraînée \
Par ce poids, de tomber en avant sur le sol ! » \
Tendre : « Faites-lui faire un petit parasol \
De peur que sa couleur au soleil ne se fane ! »"

# here

#### Exercise 8
Write a program to generate a dictionary containing n entries, i.e. n pairs (i, i*i) such that i is an integer between 1 and n. 

In [None]:
# here