# Data containers and control flow

This section covers:
A) Data containers
B) Control Flow Tools

# Python Data containers

Python includes several built-in container types:
1. List
2. Tuple
3. Set
4. Dictionary
5. String

Tuples, lists and strings use a similar syntax for accessing their contents.
The code below shows this shared syntax.

In [1]:
# We can create a tuple, a list and a string as below:

# Tuples
tu = (40, "then", 4.98, 50, 1)

# List
li = [40, "then", 4.98, 50, 1]

# String 
st = "Hellooo"

# Access individual members using array notation (indices start at 0)
print("First item in the tuple")
print(tu[0]) # First item in the tuple

print("Second item in the list")
print(li[1]) # Second item in the list

print("5th item in the string")
print(st[4]) # 5th item in the string (index 4, because counting starts at 0)

First item in the tuple
40
Second item in the list
then
5th item in the string
o
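
The shared syntax goes beyond single-item access. As a minimal sketch, reusing the same example values as the cell above, `len()`, slicing and negative indexing behave the same way on all three types:

```python
# Same example values as in the cell above
tu = (40, "then", 4.98, 50, 1)
li = [40, "then", 4.98, 50, 1]
st = "Hellooo"

# len(), slicing and negative indexing work on tuples, lists and strings alike
for container in (tu, li, st):
    print(len(container), container[:2], container[-1])

# Prints:
# 5 (40, 'then') 1
# 5 [40, 'then'] 1
# 7 He o
```

Note that a slice has the same type as the container it came from: a tuple slice is a tuple, a string slice is a string.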


# Lists

A list is the Python equivalent of an array, but is resizeable and can contain elements of different types:

In [4]:
numls = [3, 1, 2, 0, 3.8]   # Create a list
print(numls, numls[2])
print(numls[4])
print(numls[-1])        # Negative indices count from the end of the list; prints "3.8"

[3, 1, 2, 0, 3.8] 2
3.8
3.8


In [5]:
numls[2] = "gene"    # Lists can contain elements of different types
print(numls)

[3, 1, 'gene', 0, 3.8]


In [6]:
numls.append("Cell") # Add a new element to the end of the list
print(numls)

[3, 1, 'gene', 0, 3.8, 'Cell']


In [7]:
x = numls.pop()     # Remove and return the last element of the list
print(x, numls) 

Cell [3, 1, 'gene', 0, 3.8]


In [26]:
li = ["abc", 22, 4.34, 23]

# Add new item to the end of a list
li.append("newsd")

# Add new item at a specific location of a list
li.insert(2, "newinsert")

# Extend a list with another list
li.extend([345, 78, 90])
li

['abc', 22, 'newinsert', 4.34, 23, 'newsd', 345, 78, 90]

# Slicing

In addition to accessing list elements one at a time, Python provides concise syntax to access sublists; this is known as slicing:

In [10]:
nums = [3, 1, 2, 0, 3.8]    # Create a list to slice
print(nums)        # Prints "[3, 1, 2, 0, 3.8]"
print(nums[2:4])   # Get a slice from index 2 to 4 (exclusive); prints "[2, 0]"
print(nums[2:])    # Get a slice from index 2 to the end; prints "[2, 0, 3.8]"
print(nums[:2])    # Get a slice from the start to index 2 (exclusive); prints "[3, 1]"
print(nums[:])     # Get a slice of the whole list; prints "[3, 1, 2, 0, 3.8]"
print(nums[:-1])   # Slice indices can be negative; prints "[3, 1, 2, 0]"
nums[2:4] = [8, 9] # Assign a new sublist to a slice
print(nums)        # Prints "[3, 1, 8, 9, 3.8]"

[3, 1, 2, 0, 3.8]
[2, 0]
[2, 0, 3.8]
[3, 1]
[3, 1, 2, 0, 3.8]
[3, 1, 2, 0]
[3, 1, 8, 9, 3.8]
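
Two further properties of slicing are worth a quick illustration: a slice is a new list rather than a view of the original, and a slice can take an optional third value, the step. The sketch below uses the same example list; the name `copy_of_nums` is only illustrative.

```python
nums = [3, 1, 2, 0, 3.8]

copy_of_nums = nums[:]        # A full slice builds a new, independent list
copy_of_nums[0] = "changed"   # Modifying the copy...
print(nums)                   # ...leaves the original alone; prints "[3, 1, 2, 0, 3.8]"
print(copy_of_nums)           # Prints "['changed', 1, 2, 0, 3.8]"

print(nums[::2])              # Every second element; prints "[3, 2, 3.8]"
print(nums[::-1])             # A step of -1 reverses the list; prints "[3.8, 0, 2, 1, 3]"
```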


# Loops

You can loop over the elements of a list like this:

In [12]:
genes = ["TP53", "ATF2", "A2M"]

# Loop over the list
for gene in genes:
    print(gene)

TP53
ATF2
A2M
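
If the position of each element is needed inside the loop body, the built-in `enumerate` function yields index and element together. A minimal sketch using the same gene list (the `labels` list is only there to collect the results):

```python
genes = ["TP53", "ATF2", "A2M"]
labels = []

# enumerate yields (index, element) pairs, with the index starting at 0
for idx, gene in enumerate(genes):
    labels.append("#{}: {}".format(idx + 1, gene))
    print(labels[-1])

# Prints:
# #1: TP53
# #2: ATF2
# #3: A2M
```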


# List comprehensions

When programming, we frequently want to transform one type of data into another.
As a simple example, consider the following code that computes square numbers:

In [13]:
nums = [0, 1, 2, 3, 4]
squares = []

for x in nums:
    squares.append(x ** 2)
print(squares)

[0, 1, 4, 9, 16]


You can make this code simpler using a list comprehension.

In [14]:
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]
print(squares)

[0, 1, 4, 9, 16]


List comprehensions can also contain conditions:

In [15]:
nums = [0, 1, 2, 3, 4]
even_squares = [x ** 2 for x in nums if x % 2 == 0]
print(even_squares)

[0, 4, 16]


# Tuples

A tuple is an immutable, ordered sequence of values. A tuple is in many ways similar to a list; one of the most important differences is that tuples can be used as keys in dictionaries and as elements of sets, while lists cannot. Consider the code below:

In [16]:
d = {(x, x + 1): x for x in range(10)}  # Create a dictionary with tuple keys
t = (5, 6)       # Create a tuple
print(type(t))
print(d[t])     
print(d[(1, 2)])

<class 'tuple'>
5
1


However, unlike a list, a tuple does not allow item assignment.
For example:

In [17]:
li = ("abc", 22, 4.34, 23)
li[1] = 983
li

TypeError: 'tuple' object does not support item assignment

# Shared syntax

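Note that everything shown above for reading from a list (indexing, slicing, looping and use in comprehensions) applies to tuples too. Operations that modify a list in place, such as append, insert, pop or slice assignment, do not.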
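
As a short sketch of which list operations carry over to tuples, the example below slices and loops over a tuple, then shows that an in-place method such as `append` is not available. The variable names are illustrative.

```python
tu = (40, "then", 4.98, 50, 1)

first_two = tu[:2]        # Slicing a tuple gives a new tuple
print(first_two)          # Prints "(40, 'then')"

for item in tu:           # Looping works exactly as for a list
    print(item)

try:
    tu.append(7)          # Tuples have no append method
except AttributeError as err:
    print("Cannot modify a tuple:", err)
```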

# Dictionaries

A dictionary stores (key, value) pairs. 
You can use it like this:

In [25]:
gg = {"TP53": "suppressor", "NOTCH1": "oncogene"}  # Create a new dictionary with some data
print(gg["TP53"])       # Get an entry from a dictionary; prints "suppressor"
print("NOTCH1" in gg)    # Check if a dictionary has a given key; prints "True"

suppressor
True


In [28]:
gg["A2M"] = "Unknown"    # Set an entry in a dictionary
print(gg["A2M"])          # Prints "Unknown"

Unknown


In [29]:
# You get an error if you try to access a missing key
print(gg["FOXA1"])  # KeyError: 'FOXA1' is not a key of gg

KeyError: 'FOXA1'

In [30]:
# You can use dict methods too
print(gg.get("TP53", "N/A"))  # Get an element with a default; prints "suppressor"
print(gg.get("FOXA1", "N/A"))    # Get an element with a default; prints "N/A"

suppressor
N/A


In [31]:
# Remove an element from a dictionary
del gg["NOTCH1"]       
print(gg.get("NOTCH1", 'N/A')) # "NOTCH1" is no longer a key; prints "N/A"

N/A
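
Besides `get`, a missing key can be handled by catching the `KeyError` or by testing membership first. The sketch below shows both; the dictionary `roles` and the helper `describe` are hypothetical names used only for this illustration.

```python
# Hypothetical example data, similar to the dictionary used above
roles = {"TP53": "suppressor", "NOTCH1": "oncogene"}

def describe(gene):
    """Return the annotation for gene, or "N/A" if the key is missing."""
    try:
        return roles[gene]
    except KeyError:
        return "N/A"

print(describe("TP53"))    # Prints "suppressor"
print(describe("FOXA1"))   # Prints "N/A"

# Alternatively, check for the key before accessing it
if "FOXA1" not in roles:
    print("FOXA1 has no annotation yet")
```

For a simple default value, `get` is usually the shortest option; the `try`/`except` form is useful when a missing key needs more handling than returning a default.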


# Looping over a dictionary

It is easy to iterate over the keys in a dictionary:

In [34]:

di = {
    "cyto" : 10,
    "nucl" : 4,
    "ribo" : 7,
    "endop" : 2
}


for org in di:
    part = di[org]
    print('A {} has {} counts'.format(org, part))

A cyto has 10 counts
A nucl has 4 counts
A ribo has 7 counts
A endop has 2 counts


In [36]:
# If you want access to keys and their corresponding values, use the items method:

for org, count in di.items():
    print('A {} has {} counts'.format(org, count))

A cyto has 10 counts
A nucl has 4 counts
A ribo has 7 counts
A endop has 2 counts


# Shared syntax

Note that dictionary comprehensions work like list comprehensions, but they are written in curly braces and produce a key: value pair for each element.
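
A minimal sketch of a dictionary comprehension, reusing the `di` dictionary from the loop example; the names `doubled` and `high` are illustrative:

```python
di = {"cyto": 10, "nucl": 4, "ribo": 7, "endop": 2}

# Build a new dictionary by transforming every value
doubled = {org: count * 2 for org, count in di.items()}
print(doubled)   # Prints "{'cyto': 20, 'nucl': 8, 'ribo': 14, 'endop': 4}"

# A condition filters the entries, just as in a list comprehension
high = {org: count for org, count in di.items() if count > 5}
print(high)      # Prints "{'cyto': 10, 'ribo': 7}"
```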

# Sets

A set is an unordered collection of distinct elements.
Consider the code below:

In [37]:
# Set
se = {2, 3, 5, 5, 6, 8}

# Notice that duplicates are removed
se

{2, 3, 5, 6, 8}

In [38]:
print(3 in se)   # Check if an element is in a set; prints "True"
print(20 in se)  # prints "False"

True
False


In [39]:
se.add(20)      # Add an element to a set
print(20 in se)
print(len(se))       # Number of elements in a set;

True
6


In [40]:
se.add(2)       # Adding an element that is already in the set does nothing
print(len(se))       
se.remove(20)    # Remove an element from a set
print(len(se))     

6
5


# Shared syntax

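Note that you can loop over a set, and build one with a comprehension (written in curly braces), just as with lists. Because sets are unordered, do not rely on the order in which the elements are visited.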
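
As a short sketch of looping and comprehensions with sets, the example below sorts the set before printing so the order is predictable. The names `even` and `lengths` are illustrative.

```python
se = {2, 3, 5, 6, 8}

# sorted() returns a list, which gives a predictable order for printing
for item in sorted(se):
    print(item)

# A set comprehension uses curly braces and keeps only distinct results
even = {x for x in se if x % 2 == 0}
print(even == {2, 6, 8})      # Prints "True"

lengths = {len(word) for word in ["dog", "cat", "wooden"]}
print(lengths == {3, 6})      # "dog" and "cat" both give 3; prints "True"
```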

-----------------------------

# Control Flow tools

Python uses the usual flow control statements known from other languages, with some twists.

1. While loop
2. If statements
3. For statements
4. Break and continue statements
5. Pass statements

We have already used some of these in the code above.

In [41]:
# While loop

# Fibonacci series, where the sum of two elements defines the next
a, b = 0, 1
while a < 10:
    print(a)
    a, b = b, a+b

0
1
1
2
3
5
8


In [42]:
# If statements

# Create a lottery machine
ticket = int(input("Please enter your lucky number : "))

if 10 < ticket < 50:
    loto = "£50"

elif 100 < ticket < 200:
    loto = "£80"

else:
    loto = "£1"
print("Congratulations you won {}".format(loto))

Please enter your lucky number : 30
Congratulations you won £50


In [43]:
# For statements

# Measure some strings and create a new list from their lengths
words = ["dog", "wooden", "special"]
lengths = [] # This is an empty list

for z in words:
    lengths.append(len(z))
    print(z, len(z))
    
print(":::::::::::::")
print(lengths)

dog 3
wooden 6
special 7
:::::::::::::
[3, 6, 7]


In [44]:
# For statements

# Fill in a list by location
words = ["dog", "wooden", "special"]
copyW = [None] * len(words) # This is a list of None placeholders with a specific length


for i in range(len(words)):
    print(i)
    print(words[i])
    copyW[i] = words[i]
    
print(":::::::::::::")
print(copyW)

0
dog
1
wooden
2
special
:::::::::::::
['dog', 'wooden', 'special']


# break and continue Statements

The break statement breaks out of the innermost enclosing for or while loop.
The continue statement skips the rest of the loop body and continues with the next iteration of the loop.

Loop statements may have an else clause; it is executed when the loop terminates through exhaustion of the iterable (with for) or when the condition becomes false (with while), but not when the loop is terminated by a break statement.

In [45]:
# break statement
for n in range(2, 10):
    for x in range(2, n):
        if n % x == 0:
            print(n, 'equals', x, '*', n//x)
            break
    else:
        # loop fell through without finding a factor
        print(n, 'is a prime number')

2 is a prime number
3 is a prime number
4 equals 2 * 2
5 is a prime number
6 equals 2 * 3
7 is a prime number
8 equals 2 * 4
9 equals 3 * 3


In [None]:
# continue statement

for num in range(2, 10):
    if num % 2 == 0:
        print("Found an even number", num)
        continue
    print("Found an odd number", num)
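
The `pass` statement, listed among the control flow tools above, does nothing. It is useful as a placeholder where Python's syntax requires a statement but no action is needed. A minimal sketch with hypothetical example data:

```python
# pass statement

genes = ["TP53", "", "A2M"]   # Hypothetical input with one empty name
kept = []

for gene in genes:
    if gene == "":
        pass                  # Placeholder: nothing to do for empty names
    else:
        kept.append(gene)

print(kept)                   # Prints "['TP53', 'A2M']"
```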

----

# Self practice

You have now learnt about data containers and control flow tools.
Complete the tasks below to test your knowledge.

1. Create a dictionary or dictionaries containing anything you want as keys and as values (Hint: see the dictionary code block starting at In [25]).
2. Create a list of lists containing anything. Then use a loop to print each item in the inner lists.
3. Build a list containing any ten integers and any ten strings. Write code to loop over the elements of the list and check whether each one is an integer or a string. For an integer print "Number found" and for a string print "Text found".
4. Create a dictionary containing any 10 genes as keys and any 20-base DNA sequences as values. Next, write code to loop over the dictionary items and count the number of each base. Add this information to a new dictionary using the original gene name as the key and the counts as the value. Print the final new dictionary. Essentially: G1 = "GCGATA..." to G1 = "A=2, G=2, C=1, T=1"
5. Write a piece of code that uses a while loop, an if statement and a for loop to achieve a task.
6. Write 10 lines of code working with all the Python data containers.



# Remember to use markdown to comment on what you are doing.